<?xml version="1.0" encoding="utf-8" standalone="yes"?>
http://www.aygfsteel.com/paulwong/category/53883.html
Sun, 04 Jan 2015 17:23:29 GMT

Application areas of the various Hadoop frameworks
http://www.aygfsteel.com/paulwong/archive/2015/01/04/422020.html
paulwong, Sun, 04 Jan 2015 04:57:00 GMT

1. Real Time Analytics : Apache Storm
2. In-memory Analytics : Apache Spark
3. Search Analytics : Elasticsearch, Apache Solr
4. Log Analytics : ELK Stack (Elasticsearch, Logstash, Kibana), ESK Stack (Elasticsearch, Spark Streaming, Kibana)
5. Batch Analytics : Apache Hadoop MapReduce

***** NoSQL DB *****
1. MongoDB
2. HBase
3. Cassandra

***** SOA *****
1. Oracle SOA
2. JBoss SOA
3. TIBCO SOA
4. SOAP, RESTful web services

Storm, the hottest open-source streaming system, vs. the rising star Samza
http://www.aygfsteel.com/paulwong/archive/2014/12/02/420922.html
paulwong, Tue, 02 Dec 2014 07:03:00 GMT

Distributed computing frameworks fall into two main groups, depending on the kind of data set they handle: data-flow and streaming. Data-flow frameworks process data in blocks; representative examples are MapReduce and Spark, and I call these "big data". Streaming frameworks process the data that arrives within a unit of time and put more emphasis on real-time behavior; they mainly include Storm, JStorm and Samza, and I call these "fast data".

In this article I mainly discuss the streaming frameworks.

The first is Storm, a real-time computation system. It assumes the data source is dynamic and processes data like flowing water.

Its characteristics are low latency, high performance, distribution, scalability and fault tolerance.

[Figure: Storm architecture]


 

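For Storm's concepts in detail, see http://blog.csdn.net/hljlzc2007/article/details/12976211; they are not covered here.

Storm is currently the most stable open-source stream processing framework, but in my opinion it has two problems.

1. Although Storm lets you write spout and bolt code in several languages, its core is implemented in Clojure. This is a major inconvenience for people working with big data and open source, because most of them mainly know mainstream languages such as Java and C++. As a result Storm becomes hard to control: it is difficult to understand its internals in depth or to modify them.

2. Storm can run on YARN (Hadoop 2.0) and share the resources of a Hadoop cluster with other open-source frameworks, but its performance there is poor. This is something Storm still needs to improve.

Either way, Storm remains the king of open-source stream processing frameworks for now.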

The second one I want to mention is JStorm. It was built by Alibaba and can be seen as another implementation of Storm, written in Java.

Features:

1. The client API is essentially the same as Storm's, so bolt and spout code does not need to change when migrating from Storm.

2. JStorm is more stable and faster than Storm.

3. It offers some new features.

If you are interested, give it a try. Project page: https://github.com/alibaba/jstorm

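The third is Samza.

Samza is a technology open-sourced by LinkedIn: an open-source distributed stream processing system that closely resembles Storm. The difference is that it runs on top of Hadoop and uses Kafka, the distributed messaging system that LinkedIn developed itself.

It is a small and beautiful project from LinkedIn. What makes it beautiful?

1. It is only a few thousand lines of code, yet its functionality is comparable to Storm's, although it still has many shortcomings.

2. It is tightly integrated with Kafka, which makes processing data more convenient.

3. It runs on YARN.

A project I worked on earlier used Kafka + Storm + ElasticSearch. In the future Storm could be replaced entirely by Samza, which would also let the project use the Hadoop cluster's resources for storage and offline analysis. Running both real-time processing and offline analysis on Hadoop makes Samza, I have to say, a great project: it limits the growth in project complexity and eases maintenance. As the saying goes, small and beautiful things are more popular.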

Architecture:

Samza consists of three main layers:

1. Streaming layer --> Kafka

2. Execution layer --> YARN

3. Processing layer --> Samza API

Samza's streaming layer and execution layer are both pluggable; developers can substitute other frameworks and are not limited to the two technologies above.

Samza provides a YARN ApplicationMaster and a YARN job runner out of the box.

[Figure: Samza on YARN; different colors represent different hosts]

The Samza client tells the YARN ResourceManager (RM) that it wants to start a Samza job. The RM tells a YARN NodeManager (NM) to allocate space for the YARN ApplicationMaster. Once the NM has allocated the space, a YARN container runs the Samza Task Runner.


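Samza state management

Managing state in stream processing is hard. Because the data keeps flowing and carries no state of its own, the application has to rely on historical data to keep track of its context. Samza provides an embedded key-value database for this. It is based on LevelDB, keeps its data outside the JVM heap, and is used to store the historical data. The benefits of this approach are:

1. Lower JVM overhead

2. Local storage greatly improves throughput

3. Fewer concurrent operations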

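Samza processing flow

An example from the official Samza documentation groups messages by Member ID and counts page views. Input messages come from Machine 1 and Machine 2, and the output goes to Machine 3. One way to understand it: the messages are spread across different messaging systems (Kafka); Samza reads topics from the different Kafka instances, processes them, and sends the results to Machine 3. I will not break it down further here; see the official documentation for details.

[Figure: Samza page-view counting example]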
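The per-member page-view count described above can be sketched in plain Java, without the Samza API. This is only an illustration under stated assumptions: `PageViewCounter`, `process`, `countFor` and `countAll` are hypothetical names, the in-memory map stands in for a task's local key-value store, and the lists stand in for messages read from Kafka.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch (not the Samza API): counts page views per member ID.
final class PageViewCounter {
    // Stand-in for the task's local key-value store: member ID -> view count.
    private final Map<String, Long> counts = new HashMap<>();

    // Called once per incoming page-view message.
    void process(String memberId) {
        counts.merge(memberId, 1L, Long::sum);
    }

    // Returns the number of page views seen so far for one member.
    long countFor(String memberId) {
        return counts.getOrDefault(memberId, 0L);
    }

    // Feeds the messages from several input sources through one counter.
    static PageViewCounter countAll(List<List<String>> sources) {
        PageViewCounter counter = new PageViewCounter();
        for (List<String> source : sources) {
            for (String memberId : source) {
                counter.process(memberId);
            }
        }
        return counter;
    }

    public static void main(String[] args) {
        // Page-view messages arriving from two machines, keyed by member ID.
        List<String> machine1 = Arrays.asList("alice", "bob", "alice");
        List<String> machine2 = Arrays.asList("bob", "alice");
        PageViewCounter counter = countAll(Arrays.asList(machine1, machine2));
        System.out.println("alice=" + counter.countFor("alice")
                + " bob=" + counter.countFor("bob"));
    }
}
```

In a real deployment the messages would be partitioned by member ID so that all views for one member reach the same task; the sketch sidesteps this by using a single counter.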



Project page: https://github.com/apache/incubator-samza

Official documentation: http://samza.incubator.apache.org/

All of this leaves plenty of room for imagination. Will Storm keep its lead, or can Samza take its place? Either way, as a developer I cannot wait to read through those few thousand lines of code.

Auto rebalance Storm
http://www.aygfsteel.com/paulwong/archive/2014/05/09/413479.html
paulwong, Fri, 09 May 2014 15:48:00 GMT

http://stackoverflow.com/questions/15010420/storm-topology-rebalance-using-java-code


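Retrieving Storm information from Nimbus
http://www.andys-sundaypink.com/i/retrieve-storm-cluster-statistic-from-nimbus-java-mode/

// Imports assume Storm 0.8.x/0.9.x package names (an assumption; later releases differ).
import java.util.List;
import backtype.storm.generated.*;
import org.apache.thrift7.TException;
import org.apache.thrift7.protocol.TBinaryProtocol;
import org.apache.thrift7.transport.TFramedTransport;
import org.apache.thrift7.transport.TSocket;
import org.apache.thrift7.transport.TTransportException;

// Connect to Nimbus over Thrift (6627 is the default Nimbus Thrift port).
TSocket tsocket = new TSocket("localhost", 6627);
TFramedTransport tTransport = new TFramedTransport(tsocket);
TBinaryProtocol tBinaryProtocol = new TBinaryProtocol(tTransport);
Nimbus.Client client = new Nimbus.Client(tBinaryProtocol);
String topologyId = "test-1-234232567";

try {
    tTransport.open();
    // Fetch cluster-wide and per-topology information.
    ClusterSummary clusterSummary = client.getClusterInfo();
    StormTopology stormTopology = client.getTopology(topologyId);
    TopologyInfo topologyInfo = client.getTopologyInfo(topologyId);
    List<ExecutorSummary> executorSummaries = topologyInfo.get_executors();
    List<TopologySummary> topologies = clusterSummary.get_topologies();

    // Print each executor's component ID and the size of its emitted-stats map.
    for (ExecutorSummary executorSummary : executorSummaries) {
        String id = executorSummary.get_component_id();
        ExecutorInfo executorInfo = executorSummary.get_executor_info();
        ExecutorStats executorStats = executorSummary.get_stats();
        System.out.println("executorSummary :: " + id + " emit size :: " + executorStats.get_emitted_size());
    }
} catch (TTransportException e) {
    e.printStackTrace();
} catch (NotAliveException e) {
    e.printStackTrace();
} catch (TException e) {
    e.printStackTrace();
} finally {
    // Always release the connection to Nimbus.
    tTransport.close();
}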






A brief explanation of Storm
http://www.aygfsteel.com/paulwong/archive/2014/05/09/413476.html
paulwong, Fri, 09 May 2014 14:56:00 GMT

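Message processing can be customized in several ways:

  1. The steps used to process a message can be customized.
  2. The number of processes that handle each type of message can be customized.
  3. Within each step, messages are processed by threads that run inside one of those processes.
  4. The number of threads for each step can be customized.
  5. Message types to be processed can be added and removed.

To process a new kind of message:

  1. Define the data source component (SPOUT).
  2. Define the processing steps (BOLT).
  3. Combine them into a message processing flow, the TOPOLOGY.
  4. Define the number of processes that handle the messages and the number of threads each step may use concurrently.
  5. Deploy the TOPOLOGY.

When a TOPOLOGY is deployed to Storm, Storm looks up the number of WORKERs in the configuration object and starts that many JVMs. It then creates threads according to the NUMTASKS configured for each step and instantiates the configured number of objects for each step. Next it starts a thread that repeatedly calls the SPOUT's nextTuple() method. Whenever that method emits a result, another thread is started, and in that thread the result is passed as an argument to the execute method of the next object.

If another BOLT step then needs to run, a thread is likewise taken to execute the BOLT's method. The number of threads started never exceeds NUMTASKS.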
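The hand-off from a spout thread to bolt threads described above can be sketched in plain Java. This is a simplified model, not Storm's implementation or API: `MiniTopology`, `Spout`, `Bolt` and `run` are hypothetical names, `nextTuple()` here returns the tuple directly (in Storm it emits through a collector), and the sketch stops when the spout returns `null`, whereas a real spout runs indefinitely.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of one spout thread feeding a bounded pool of bolt threads.
final class MiniTopology {
    interface Spout { String nextTuple(); }        // returns null when exhausted (sketch only)
    interface Bolt { void execute(String tuple); }

    static void run(Spout spout, Bolt bolt, int boltThreads) {
        // The pool size caps the number of bolt threads, like a configured thread count.
        ExecutorService boltPool = Executors.newFixedThreadPool(boltThreads);
        // One dedicated thread keeps calling nextTuple() and hands each result to a bolt thread.
        Thread spoutThread = new Thread(() -> {
            String tuple;
            while ((tuple = spout.nextTuple()) != null) {
                final String emitted = tuple;
                boltPool.submit(() -> bolt.execute(emitted));
            }
        });
        spoutThread.start();
        try {
            spoutThread.join();
            boltPool.shutdown();
            if (!boltPool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("bolt threads did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Iterator<String> words = Arrays.asList("storm", "spout", "bolt").iterator();
        List<String> seen = Collections.synchronizedList(new ArrayList<String>());
        run(() -> words.hasNext() ? words.next() : null,
            tuple -> seen.add(tuple.toUpperCase()),
            2);
        Collections.sort(seen); // bolt threads may finish in any order
        System.out.println(seen);
    }
}
```

The fixed-size pool is the design point: however many tuples the spout produces, the number of concurrently running bolt threads never exceeds the configured limit.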




Storm performance
http://www.aygfsteel.com/paulwong/archive/2014/05/08/413391.html
paulwong, Thu, 08 May 2014 01:19:00 GMT

The configuration is used to tune various aspects of the running topology. The two configurations specified here are very common:

  1. TOPOLOGY_WORKERS (set with setNumWorkers) specifies how many processes you want allocated around the cluster to execute the topology. Each component in the topology executes as multiple threads; the number of threads allocated to a given component is configured through the setBolt and setSpout methods. Those threads exist within worker processes, and each worker process contains some number of threads for some number of components. For instance, you may have 300 threads specified across all your components and 50 worker processes specified in your config. Each worker process will then execute 6 threads, each of which could belong to a different component. You tune the performance of Storm topologies by tweaking the parallelism of each component and the number of worker processes those threads run within.
  2. TOPOLOGY_DEBUG (set with setDebug), when set to true, tells Storm to log every message emitted by a component. This is useful in local mode when testing topologies, but you probably want to keep it turned off when running topologies on the cluster.

There are many other configurations you can set for the topology. The various configurations are detailed in the Javadoc for Config.
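The thread arithmetic in the example above (300 threads over 50 workers) can be checked with a few lines of Java. This sketch assumes threads are spread as evenly as possible across workers; `WorkerMath` and `maxThreadsPerWorker` are hypothetical names, not part of Storm.

```java
// Hypothetical helper: how many threads land on the busiest worker
// when the total is spread as evenly as possible.
final class WorkerMath {
    static int maxThreadsPerWorker(int totalThreads, int workers) {
        if (workers <= 0) {
            throw new IllegalArgumentException("workers must be positive");
        }
        // Ceiling division, so a remainder adds one thread to some workers.
        return (totalThreads + workers - 1) / workers;
    }

    public static void main(String[] args) {
        System.out.println(maxThreadsPerWorker(300, 50)); // prints 6
    }
}
```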


Common configurations


There are a variety of configurations you can set per topology. A list of all of them can be found in the Javadoc for Config. The ones prefixed with "TOPOLOGY" can be overridden on a topology-specific basis (the other ones are cluster configurations and cannot be overridden). Here are some common ones that are set for a topology:

  1. Config.TOPOLOGY_WORKERS: This sets the number of worker processes to use to execute the topology. For example, if you set this to 25, there will be 25 Java processes across the cluster executing all the tasks. If you had a combined 150 parallelism across all components in the topology, each worker process will have 6 tasks running within it as threads.
  2. Config.TOPOLOGY_ACKERS: This sets the number of tasks that will track tuple trees and detect when a spout tuple has been fully processed. Ackers are an integral part of Storm's reliability model and you can read more about them in Guaranteeing message processing.
  3. Config.TOPOLOGY_MAX_SPOUT_PENDING: This sets the maximum number of spout tuples that can be pending on a single spout task at once (pending means the tuple has not been acked or failed yet). It is highly recommended you set this config to prevent queue explosion.
  4. Config.TOPOLOGY_MESSAGE_TIMEOUT_SECS: This is the maximum amount of time a spout tuple has to be fully completed before it is considered failed. This value defaults to 30 seconds, which is sufficient for most topologies. See Guaranteeing message processing for more information on how Storm's reliability model works.
  5. Config.TOPOLOGY_SERIALIZATIONS: You can register more serializers to Storm using this config so that you can use custom types within tuples.
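The effect of TOPOLOGY_MAX_SPOUT_PENDING described above, a cap on the number of tuples that are emitted but not yet acked or failed, can be modeled with a semaphore. This is a sketch of the behavior only, not how Storm implements it; `PendingThrottle`, `tryEmit` and `complete` are hypothetical names.

```java
import java.util.concurrent.Semaphore;

// Hypothetical model of a "max pending" cap for a single spout task.
final class PendingThrottle {
    private final Semaphore slots;

    PendingThrottle(int maxPending) {
        this.slots = new Semaphore(maxPending);
    }

    // Returns true if another tuple may be emitted now, false if the cap is reached.
    boolean tryEmit() {
        return slots.tryAcquire();
    }

    // Called when a pending tuple is acked or failed; frees one slot.
    void complete() {
        slots.release();
    }

    public static void main(String[] args) {
        PendingThrottle throttle = new PendingThrottle(2);
        System.out.println(throttle.tryEmit()); // first tuple is accepted
        System.out.println(throttle.tryEmit()); // second tuple is accepted
        System.out.println(throttle.tryEmit()); // cap reached, emission is refused
        throttle.complete();                    // one tuple is acked
        System.out.println(throttle.tryEmit()); // a slot is free again
    }
}
```

Without such a cap a fast spout can outrun slow bolts and fill the queues between them, which is the "queue explosion" the setting is meant to prevent.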

Reference:
http://storm.incubator.apache.org/documentation/Running-topologies-on-a-production-cluster.html

Adjusting topology parallelism with the storm rebalance command, and an analysis of the problems encountered
http://blog.csdn.net/jmppok/article/details/17243857

flume+kafka+storm+mysql data flow
http://blog.csdn.net/jmppok/article/details/17259145



http://storm.incubator.apache.org/documentation/Tutorial.html

Installing Storm
http://www.aygfsteel.com/paulwong/archive/2014/05/04/413230.html
paulwong, Sun, 04 May 2014 10:01:00 GMT

  • install ZeroMQ
    wget http://download.zeromq.org/historic/zeromq-2.1.7.tar.gz
    tar -xzf zeromq-2.1.7.tar.gz
    cd zeromq-2.1.7
    ./configure
    # configure may report missing packages; if so, install them: sudo apt-get install g++ uuid-dev
    make
    sudo make install
  • install JZMQ
    git clone https://github.com/nathanmarz/jzmq.git
    cd jzmq
    ./autogen.sh
    ./configure
    make
    sudo make install

  • Download and unpack Storm

  • Edit conf/storm.yaml
    storm.zookeeper.servers:
      - "1.2.3.5"
      - "1.2.3.6"
      - "1.2.3.7"
    storm.local.dir: "/opt/folder"
    nimbus.host: "54.72.4.92"
    supervisor.slots.ports:
      - 6700
      - 6701
      - 6702
  • Edit /etc/profile
    export JAVA_HOME=/usr/lib/jvm/java-7-oracle
    export STORM_HOME=/home/ubuntu/java/storm-0.8.1
    export KAFKA_HOME=/home/ubuntu/java/kafka_2.9.2-0.8.1.1
    export ZOOKEEPER_HOME=/home/ubuntu/java/zookeeper-3.4.6

    export PATH=$JAVA_HOME/bin:$STORM_HOME/bin:$KAFKA_HOME/bin:$ZOOKEEPER_HOME/bin:$PATH

  • Create a startup script: start-storm.sh
    storm nimbus &
    storm supervisor &
    storm ui &

  • If you run into problems during installation, see:
    http://my.oschina.net/mingdongcheng/blog/43009

Starting Storm and deploying a topology
http://www.aygfsteel.com/paulwong/archive/2013/09/11/403942.html
paulwong, Wed, 11 Sep 2013 03:00:00 GMT

  • Start ZooKeeper
    zkServer.sh start
  • Start Nimbus
    storm nimbus &
  • Start the supervisor
    storm supervisor &
  • Start the UI
    storm ui &
  • Deploy a topology
    storm jar /opt/hadoop/loganalyst/storm-dependend/data/teststorm-1.0.jar teststorm.TopologyMain /opt/hadoop/loganalyst/storm-dependend/data/words.txt
  • Kill a topology
    storm kill {toponame}
  • Activate a topology
    storm activate {toponame}
  • Deactivate a topology
    storm deactivate {toponame}
  • List all topologies
    storm list





Storm resources
http://www.aygfsteel.com/paulwong/archive/2013/09/08/403826.html
paulwong, Sun, 08 Sep 2013 11:59:00 GMT

    http://www.jansipke.nl/installing-a-storm-cluster-on-centos-hosts/
    http://www.cnblogs.com/kemaswill/archive/2012/10/24/2737833.html
    http://abentotoro.blog.sohu.com/197023262.html
    http://www.cnblogs.com/panfeng412/archive/2012/11/30/how-to-install-and-deploy-storm-cluster.html


    Processing real-time big data with Twitter Storm
    http://www.ibm.com/developerworks/cn/opensource/os-twitterstorm/


    Analysis and discussion of Storm's data stream model
    http://www.cnblogs.com/panfeng412/archive/2012/07/29/storm-stream-model-analysis-and-discussion.html
    http://www.cnblogs.com/panfeng412/tag/Storm/


    storm-kafka
    https://github.com/nathanmarz/storm-contrib/tree/master/storm-kafka


    Real-time big data analysis with Storm
    http://www.csdn.net/article/2012-12-24/2813117-storm-realtime-big-data-analysis


    storm-deploy-aws
    https://github.com/nathanmarz/storm-deploy/wiki


    !!! The Twitter Storm topic on Zhihu
    http://www.zhihu.com/topic/19673110


    storm-elastic-search
    https://github.com/hmsonline/storm-elastic-search


    storm-examples
    https://github.com/stormprocessor/storm-examples


    kafka-aws
    https://github.com/nathanmarz/kafka-deploy


    Next Gen Real-time Streaming with Storm-Kafka Integration
    http://blog.infochimps.com/2012/10/30/next-gen-real-time-streaming-storm-kafka-integration/


    flume+kafka+storm+mysql data flow
    http://blog.csdn.net/baiyangfu/article/details/8096088
    http://blog.csdn.net/baiyangfu/article/category/1244640


    Kafka study notes
    http://blog.csdn.net/baiyangfu/article/details/8096084


    STORM+KAFKA
    https://github.com/buildlackey/cep


    STORM+KETTLE
    https://github.com/buildlackey/kettle-storm



    Comparing STORM and HADOOP</title><link>http://www.aygfsteel.com/paulwong/archive/2013/09/08/403824.html</link><dc:creator>paulwong</dc:creator><author>paulwong</author><pubDate>Sun, 08 Sep 2013 11:49:00 GMT</pubDate><guid>http://www.aygfsteel.com/paulwong/archive/2013/09/08/403824.html</guid><wfw:comment>http://www.aygfsteel.com/paulwong/comments/403824.html</wfw:comment><comments>http://www.aygfsteel.com/paulwong/archive/2013/09/08/403824.html#Feedback</comments><slash:comments>0</slash:comments><wfw:commentRss>http://www.aygfsteel.com/paulwong/comments/commentRss/403824.html</wfw:commentRss><trackback:ping>http://www.aygfsteel.com/paulwong/services/trackbacks/403824.html</trackback:ping><description><![CDATA[Given a pile of data that keeps growing, how can you compute statistics over it?<br /><ol><li>Wait until the data has grown to a certain size, then run a statistics program over it. This suits scenarios without strict real-time requirements.<br />For example, load the data into HDFS and then run a MapReduce job.<br /></li><li>When real-time requirements are strict, the approach above no longer works, which leads to the second approach.<br />Each time a new record arrives, run the statistics job and write the result to a database or a search engine index.<br />This is the kind of work STORM does.</li></ol><br />HADOOP compared with STORM<br /><ol><li>Data source: HADOOP works on the data under a folder in HDFS, possibly terabytes of it; STORM works on each newly arrived record in real time.</li><li>Processing: HADOOP runs a MAP phase followed by a REDUCE phase; in STORM the user defines the processing flow.<br />The flow can contain multiple steps, and each step is either a data source (SPOUT) or processing logic (BOLT).</li><li>Termination: a HADOOP job eventually ends; STORM has no end state. After the last step it simply waits until new data arrives and then starts again from the beginning.</li><li>Speed: HADOOP is designed to process large volumes of data on HDFS and is slow; STORM only has to process each newly arrived record, so it can be very fast.</li><li>Use cases: HADOOP is used when a batch of data needs processing and timeliness does not matter; you submit a job whenever there is something to process. STORM is used to process each newly arrived record, when timeliness matters.<br /></li><li>Comparison with MQ: HADOOP has no counterpart. STORM can be viewed as N steps, where each step, once finished, sends a message to the next MQ, and the consumer listening on that MQ continues the processing.<br /><br /></li></ol><img src ="http://www.aygfsteel.com/paulwong/aggbug/403824.html" width = "1" height = "1" /><br><br><div align=right><a style="text-decoration:none;" href="http://www.aygfsteel.com/paulwong/" target="_blank">paulwong</a> 2013-09-08 19:49 
<a href="http://www.aygfsteel.com/paulwong/archive/2013/09/08/403824.html#Feedback" target="_blank" style="text-decoration:none;">Post a comment</a></div>]]></description></item></channel></rss> </body>