
Several ways to load a file in Spark

2016-01-27 20:24

There are a few ways to load a file into Spark:

1. Load a local file directly, rather than from HDFS

sc.textFile("file:///path to the file/")

For example: sc.textFile("file:///home/spark/Desktop/README.md")

Note:

When HADOOP_CONF_DIR is set, i.e. when a cluster environment is configured, a bare call like sc.textFile("path/README.md") resolves the relative path against the default filesystem, so the path automatically becomes hdfs://master:9000/user/spark/README.md. If the file is not actually in HDFS, the job fails with "input path does not exist".
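
A minimal spark-shell sketch of both behaviors (the paths and the hdfs://master:9000 default are the ones from this post; adjust them to your setup):

// Explicit local path: read from the driver machine's local filesystem.
val local = sc.textFile("file:///home/spark/Desktop/README.md")
println(local.count())   // number of lines in the local file

// No scheme: with HADOOP_CONF_DIR set, the relative path resolves against
// the default filesystem, e.g. hdfs://master:9000/user/spark/README.md.
val fromHdfs = sc.textFile("README.md")
println(fromHdfs.count())   // fails with "input path does not exist" if the file is not in HDFS

One caveat with file:// on a real cluster: the file has to exist at the same path on every worker node, otherwise the tasks that read the missing pieces will fail.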

2. Passing an HDFS path also works
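
For example, with a fully qualified URI (the master:9000 namenode address is the one from the note above, an assumption about your cluster):

// A fully qualified HDFS URI works regardless of the default filesystem.
val rdd = sc.textFile("hdfs://master:9000/user/spark/README.md")
println(rdd.first())   // print the first line as a quick sanity check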

Related material:

1.

Spark Quick Start - call to open README.md needs explicit fs prefix

Good catch; the Spark cluster on EC2 is configured to use HDFS as its default filesystem, so it can't find this file. The quick start was written to run on a single machine with an out-of-the-box install. If you'd like to upload this file to the HDFS cluster on EC2, use the following command:
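
The command itself was lost in the copy; judging from item 2 below, it would have been a hadoop fs -put along these lines (the paths are placeholders, not the original reply's exact text):

${HADOOP_COMMON_HOME}/bin/hadoop fs -put README.md README.md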

2.

This has been discussed on the Spark mailing list; please refer to that thread.

You should use hadoop fs -put <localsrc> ... <dst> to copy the file into HDFS:

${HADOOP_COMMON_HOME}/bin/hadoop fs -put /path/to/README.md README.md
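
Once the file is in HDFS, the relative path from section 1 resolves correctly. A quick check from spark-shell (assuming the HDFS home directory is /user/spark, as above):

// The bare relative path now resolves to
// hdfs://master:9000/user/spark/README.md, so the read succeeds.
val readme = sc.textFile("README.md")
println(readme.count())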

So I ran /bin/hadoop -fs -put /home/spark/Desktop/README.md README.md, but no matter what I tried it failed with "no such file or directory", and I'm still looking into it. (Two likely culprits: /bin/hadoop is probably not where the hadoop binary actually lives, in which case the shell itself prints "no such file or directory"; and -fs should be the fs subcommand, not a flag.)
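
For reference, a corrected invocation (assuming HADOOP_COMMON_HOME is set and your HDFS home directory exists; if it does not, create it first with hadoop fs -mkdir -p /user/spark):

# fs is a subcommand, not a flag, and the hadoop binary normally lives
# under ${HADOOP_COMMON_HOME}/bin rather than /bin.
${HADOOP_COMMON_HOME}/bin/hadoop fs -put /home/spark/Desktop/README.md README.md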
Tags: hdfs spark