Reading network data with Spark Streaming
2016-07-13 10:50
package youling.studio.streaming

import org.apache.spark.streaming.{Seconds, StreamingContext}
import StreamingContext._ // implicit conversions for pair DStreams (required on older Spark versions)
import org.apache.spark._
import org.apache.spark.SparkContext._
import org.apache.spark.storage.StorageLevel

/**
 * Created by rolin on 16/7/11.
 */
object ReadSocket {
  def main(args: Array[String]) {
    if (args.length < 2) {
      System.err.println("Two arguments are required: <master> <output>")
      System.exit(1)
    }
    val Array(master, output) = args.take(2)
    System.out.println(master)
    System.out.println(output)

    val conf = new SparkConf().setMaster(master).setAppName("test socket streaming!")
    // One micro-batch every 30 seconds
    val ssc = new StreamingContext(conf, Seconds(30))

    // Read text lines from the socket served by `nc -lk 7777`
    val lines = ssc.socketTextStream("127.0.0.1", 7777, StorageLevel.MEMORY_AND_DISK_SER)
    val words = lines.flatMap(line => line.split(" "))
    // Word counts are computed per batch, not accumulated across batches
    val wc = words.map((_, 1)).reduceByKey((x, y) => x + y)

    // A DStream is saved with saveAsTextFiles(prefix), not saveAsTextFile
    // wc.saveAsTextFiles(output)
    wc.print()

    println("start streaming")
    ssc.start()
    ssc.awaitTermination() // blocks until the streaming context is stopped
    println("done")
  }
}
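To check the counting logic without opening a socket, the same flatMap / map / reduce-by-key steps can be tried on an ordinary Scala collection. This is only a rough sketch: `WordCountSketch` and `countWords` are names made up for illustration, and `groupBy` plus a sum stands in for `reduceByKey`, which plain collections do not have.

```scala
object WordCountSketch {
  // Counts the words in one batch of lines, mirroring the
  // flatMap -> map -> reduceByKey steps applied to each streaming batch.
  def countWords(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(line => line.split(" "))   // split each line on single spaces
      .map((_, 1))                        // pair every word with a count of 1
      .groupBy(_._1)                      // group the pairs by word
      .map { case (word, pairs) => word -> pairs.map(_._2).sum } // sum counts per word

  def main(args: Array[String]): Unit = {
    // Hypothetical sample input standing in for lines typed into nc
    val batch = Seq("hello spark", "hello streaming")
    countWords(batch).toSeq.sortBy(_._1).foreach(println)
  }
}
```

In the streaming job, this computation runs once for every 30-second batch, so each printed result covers only the lines received during that interval.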
Start the network sender:
nc -lk 7777
Start Spark in local mode:
bin/spark-submit --class youling.studio.streaming.ReadSocket /Users/rolin/IdeaProjects/spark-test/target/sparktest-1.0-SNAPSHOT.jar local[4] ./