          Reposted from http://www.cnblogs.com/tovin/p/3974417.html

          This article describes how to integrate Storm with Kafka in a Storm program.

            I. Implementation Model

             Data flow:

              1. A Kafka producer generates messages for the topic1 topic.

              2. A Storm topology contains three components: KafkaSpout, SenqueceBolt, and KafkaBolt. KafkaSpout subscribes to topic1 and passes each message

                to SenqueceBolt for processing; KafkaBolt then publishes the results to Kafka as topic2 messages.

              3. A Kafka consumer consumes the messages of the topic2 topic.

              

              

            II. Topology Implementation

              1. Create a Maven project and configure pom.xml

                The project depends on three artifacts: storm-core, kafka_2.10, and storm-kafka.

          <dependencies>
              <dependency>
                  <groupId>org.apache.storm</groupId>
                  <artifactId>storm-core</artifactId>
                  <version>0.9.2-incubating</version>
                  <scope>provided</scope>
              </dependency>
              <dependency>
                  <groupId>org.apache.kafka</groupId>
                  <artifactId>kafka_2.10</artifactId>
                  <version>0.8.1.1</version>
                  <exclusions>
                      <exclusion>
                          <groupId>org.apache.zookeeper</groupId>
                          <artifactId>zookeeper</artifactId>
                      </exclusion>
                      <exclusion>
                          <groupId>log4j</groupId>
                          <artifactId>log4j</artifactId>
                      </exclusion>
                  </exclusions>
              </dependency>
              <dependency>
                  <groupId>org.apache.storm</groupId>
                  <artifactId>storm-kafka</artifactId>
                  <version>0.9.2-incubating</version>
              </dependency>
          </dependencies>

          <build>
              <plugins>
                  <plugin>
                      <artifactId>maven-assembly-plugin</artifactId>
                      <version>2.4</version>
                      <configuration>
                          <descriptorRefs>
                              <descriptorRef>jar-with-dependencies</descriptorRef>
                          </descriptorRefs>
                      </configuration>
                      <executions>
                          <execution>
                              <id>make-assembly</id>
                              <phase>package</phase>
                              <goals>
                                  <goal>single</goal>
                              </goals>
                          </execution>
                      </executions>
                  </plugin>
              </plugins>
          </build>

           

              2. KafkaSpout

                KafkaSpout is a spout shipped with Storm; its source lives at https://github.com/apache/incubator-storm/tree/master/external

                To use KafkaSpout you need to implement the Scheme interface yourself; it is responsible for parsing the required data out of the message stream.

          import java.io.UnsupportedEncodingException;
          import java.util.List;

          import backtype.storm.spout.Scheme;
          import backtype.storm.tuple.Fields;
          import backtype.storm.tuple.Values;

          public class MessageScheme implements Scheme {

              /* @see backtype.storm.spout.Scheme#deserialize(byte[]) */
              public List<Object> deserialize(byte[] ser) {
                  try {
                      String msg = new String(ser, "UTF-8");
                      return new Values(msg);
                  } catch (UnsupportedEncodingException e) {
                      // UTF-8 is always available; fall through and drop the message
                  }
                  return null;
              }

              /* @see backtype.storm.spout.Scheme#getOutputFields() */
              public Fields getOutputFields() {
                  return new Fields("msg");
              }
          }
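The scheme's deserialize step is just a UTF-8 decode of the raw message bytes that KafkaSpout pulls from Kafka. The round trip can be checked without a Storm cluster; this is a minimal sketch, and the class name here is illustrative:

```java
public class SchemeRoundTrip {
    public static void main(String[] args) throws java.io.UnsupportedEncodingException {
        // The bytes KafkaSpout would hand to the scheme
        byte[] ser = "hello kafka".getBytes("UTF-8");
        // The string MessageScheme emits under the "msg" field
        String msg = new String(ser, "UTF-8");
        System.out.println(msg);  // prints: hello kafka
    }
}
```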

              3. SenqueceBolt

                 SenqueceBolt is very simple: it prepends "I'm " to each message it receives from the spout.

          import backtype.storm.topology.BasicOutputCollector;
          import backtype.storm.topology.OutputFieldsDeclarer;
          import backtype.storm.topology.base.BaseBasicBolt;
          import backtype.storm.tuple.Fields;
          import backtype.storm.tuple.Tuple;
          import backtype.storm.tuple.Values;

          public class SenqueceBolt extends BaseBasicBolt {

              /* @see backtype.storm.topology.IBasicBolt#execute(Tuple, BasicOutputCollector) */
              public void execute(Tuple input, BasicOutputCollector collector) {
                  String word = (String) input.getValue(0);
                  String out = "I'm " + word + "!";
                  System.out.println("out=" + out);
                  collector.emit(new Values(out));
              }

              /* @see backtype.storm.topology.IComponent#declareOutputFields(OutputFieldsDeclarer) */
              public void declareOutputFields(OutputFieldsDeclarer declarer) {
                  declarer.declare(new Fields("message"));
              }
          }
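Because the bolt's per-tuple work is a pure string transformation, it can be exercised without a running topology. A sketch, with the helper class and method names made up for illustration:

```java
public class SenqueceLogic {
    // The same transformation SenqueceBolt applies to each incoming message
    static String decorate(String word) {
        return "I'm " + word + "!";
    }

    public static void main(String[] args) {
        System.out.println(decorate("kafka"));  // prints: I'm kafka!
    }
}
```

Keeping the logic in a plain method like this also makes the bolt easy to unit-test separately from Storm.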

              4. KafkaBolt

                KafkaBolt is a bolt shipped with Storm; it publishes messages to a Kafka topic.
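KafkaBolt does not take its producer settings through a constructor; in storm-kafka 0.9.x it looks them up in the topology configuration, under the same keys used in the Topology code of step 5. Building just that configuration map in isolation (the broker address and serializer values below are placeholders for your cluster):

```java
import java.util.HashMap;
import java.util.Map;

public class KafkaBoltConfig {
    public static Map<String, Object> producerConfig() {
        Map<String, String> props = new HashMap<String, String>();
        props.put("metadata.broker.list", "node04:9092");                 // Kafka broker address (placeholder)
        props.put("serializer.class", "kafka.serializer.StringEncoder"); // message serializer class

        Map<String, Object> conf = new HashMap<String, Object>();
        conf.put("kafka.broker.properties", props); // read by KafkaBolt when it creates its producer
        conf.put("topic", "topic2");                // topic KafkaBolt publishes to
        return conf;
    }

    public static void main(String[] args) {
        System.out.println(producerConfig());
    }
}
```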

              5. Topology

          import java.util.HashMap;
          import java.util.Map;

          import storm.kafka.BrokerHosts;
          import storm.kafka.KafkaSpout;
          import storm.kafka.SpoutConfig;
          import storm.kafka.ZkHosts;
          import storm.kafka.bolt.KafkaBolt;
          import backtype.storm.Config;
          import backtype.storm.LocalCluster;
          import backtype.storm.StormSubmitter;
          import backtype.storm.spout.SchemeAsMultiScheme;
          import backtype.storm.topology.TopologyBuilder;
          import backtype.storm.utils.Utils;

          public class StormKafkaTopo {

              public static void main(String[] args) throws Exception {
                  // Zookeeper ensemble addresses
                  BrokerHosts brokerHosts = new ZkHosts("node04:2181,node05:2181,node06:2181");
                  // Topic KafkaSpout subscribes to, plus the Zookeeper node path and id for offset storage
                  SpoutConfig spoutConfig = new SpoutConfig(brokerHosts, "topic1", "/zkkafkaspout", "kafkaspout");

                  // Configure kafka.broker.properties for KafkaBolt
                  Config conf = new Config();
                  Map<String, String> map = new HashMap<String, String>();
                  // Kafka broker address
                  map.put("metadata.broker.list", "node04:9092");
                  // serializer.class is the message serializer class
                  map.put("serializer.class", "kafka.serializer.StringEncoder");
                  conf.put("kafka.broker.properties", map);
                  // Topic that KafkaBolt produces to
                  conf.put("topic", "topic2");

                  spoutConfig.scheme = new SchemeAsMultiScheme(new MessageScheme());

                  TopologyBuilder builder = new TopologyBuilder();
                  builder.setSpout("spout", new KafkaSpout(spoutConfig));
                  builder.setBolt("bolt", new SenqueceBolt()).shuffleGrouping("spout");
                  builder.setBolt("kafkabolt", new KafkaBolt<String, Integer>()).shuffleGrouping("bolt");

                  if (args != null && args.length > 0) {
                      // Submit to the cluster
                      conf.setNumWorkers(3);
                      StormSubmitter.submitTopology(args[0], conf, builder.createTopology());
                  } else {
                      // Run in local mode
                      LocalCluster cluster = new LocalCluster();
                      cluster.submitTopology("Topo", conf, builder.createTopology());
                      Utils.sleep(100000);
                      cluster.killTopology("Topo");
                      cluster.shutdown();
                  }
              }
          }

           

                 

            III. Testing

              1. Use the Kafka console client as a producer and publish messages to topic1:

                bin/kafka-console-producer.sh --broker-list node04:9092 --topic topic1

              2. Use the Kafka console client as a consumer and subscribe to topic2:

                bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic topic2 --from-beginning

              3. Run the Storm topology:

                bin/storm jar storm-kafka-0.0.1-SNAPSHOT-jar-with-dependencies.jar StormKafkaTopo KafkaStorm

              4. Results

                  

                  

          posted on 2015-03-01 15:47 by SIMONE, 4964 reads, 0 comments, category: hadoop