spark core 2.0 Partition and HadoopPartition
2017-02-17 16:05
In Spark, `Partition` is a trait that identifies a single partition of an RDD:

```scala
/**
 * An identifier for a partition in an RDD.
 */
trait Partition extends Serializable {
  /**
   * Get the partition's index within its parent RDD
   */
  def index: Int

  // A better default implementation of HashCode
  override def hashCode(): Int = index

  override def equals(other: Any): Boolean = super.equals(other)
}
```
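To see how the trait is meant to be used, here is a minimal sketch of a custom partition. The trait is re-declared locally so the snippet runs without a Spark classpath, and `RangePartition` with its `start`/`end` fields is a hypothetical example, not a Spark class:

```scala
// Standalone re-declaration of the Partition trait (assumption: we only
// need index/hashCode for the demo, no Spark dependency).
trait Partition extends Serializable {
  def index: Int
  override def hashCode(): Int = index
}

// Hypothetical partition for a range-based RDD: it records its position
// in the parent RDD plus the sub-range of values it covers.
class RangePartition(override val index: Int, val start: Long, val end: Long)
  extends Partition

val p = new RangePartition(3, 300L, 400L)
// hashCode falls back to the index, so partitions hash cheaply and stably.
assert(p.hashCode == 3)
```

The point of the index-based `hashCode` is that a partition's identity within its parent RDD is fully determined by its index, so hashing needs nothing else.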
`HadoopPartition` is a concrete implementation of `Partition` that wraps a Hadoop `InputSplit`:

```scala
/**
 * A Spark split class that wraps around a Hadoop InputSplit.
 */
private[spark] class HadoopPartition(rddId: Int, override val index: Int, s: InputSplit)
  extends Partition {

  val inputSplit = new SerializableWritable[InputSplit](s)

  override def hashCode(): Int = 31 * (31 + rddId) + index

  override def equals(other: Any): Boolean = super.equals(other)

  /**
   * Get any environment variables that should be added to the users environment when running pipes
   * @return a Map with the environment variables and corresponding values, it could be empty
   */
  def getPipeEnvVars(): Map[String, String] = {
    val envVars: Map[String, String] = if (inputSplit.value.isInstanceOf[FileSplit]) {
      val is: FileSplit = inputSplit.value.asInstanceOf[FileSplit]
      // map_input_file is deprecated in favor of mapreduce_map_input_file but set both
      // since it's not removed yet
      Map("map_input_file" -> is.getPath().toString(),
        "mapreduce_map_input_file" -> is.getPath().toString())
    } else {
      Map()
    }
    envVars
  }
}
```
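Unlike the trait's default, `HadoopPartition` folds the parent RDD's id into its hash. The sketch below reproduces that recipe standalone (the `hadoopPartitionHash` helper is mine, but the `31 * (31 + rddId) + index` formula is taken verbatim from the class above):

```scala
// Standalone copy of HadoopPartition's hash recipe: the classic 31-based
// combination keeps partitions of different RDDs from colliding even
// when their indices overlap.
def hadoopPartitionHash(rddId: Int, index: Int): Int =
  31 * (31 + rddId) + index

// The same index in two different RDDs hashes differently...
assert(hadoopPartitionHash(1, 0) != hadoopPartitionHash(2, 0))
// ...and within one RDD, consecutive indices get consecutive hashes.
assert(hadoopPartitionHash(1, 1) == hadoopPartitionHash(1, 0) + 1)
```

This matters because `HadoopPartition` instances from many RDDs can end up in the same hash-keyed collections inside the scheduler, where an index-only hash would cluster badly.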