Cloudera Certification: Developer Exam Outline
2017-05-16 16:11
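The CCA175 developer certification exam has 10 to 12 questions, mostly hands-on tasks on CDH5.
To pass the exam, you need the following basic skills:
1. Data ingestion
This means knowing Sqoop's ETL commands, how Flume collects data, and how to load data with the HDFS command line.
2. Transforming, staging, and storing data
Use Spark to read data from HDFS, apply some basic processing, and write the results back to HDFS. For Spark you need Python as well as Scala.
Typical tasks are reading file data and running computations on it.
3. Data analysis
Use DDL to create tables. This covers managed (internal) tables, external tables, partitioned tables, specifying the storage format, specifying the delimiter, and creating tables from a schema file.
All of this is basic material; plenty of practice is enough.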
CCA Spark and Hadoop Developer Exam (CCA175)
Number of Questions: 10–12 performance-based (hands-on) tasks on CDH5 cluster. See below for full cluster configuration
Time Limit: 120 minutes
Passing Score: 70%
Language: English, Japanese (forthcoming)
Required Skills
Data Ingest
The skills to transfer data between external systems and your cluster. This includes the following:
· Import data from a MySQL database into HDFS using Sqoop
· Export data to a MySQL database from HDFS using Sqoop
· Change the delimiter and file format of data during import using Sqoop
· Ingest real-time and near-real time (NRT) streaming data into HDFS using Flume
· Load data into and out of HDFS using the Hadoop File System (FS) commands
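To make the Sqoop and FS items above concrete, here is a minimal sketch in Python that only assembles the command lines, so the flags can be read side by side. The flags themselves are standard Sqoop 1.4 and `hdfs dfs` options; the helper names, the JDBC URL, the user, and every path are hypothetical placeholders, not values from the exam.

```python
def sqoop_import_args(jdbc_url, username, password_file, table, target_dir,
                      file_format="text", delimiter=","):
    """Build the argv list for a `sqoop import` of one table into HDFS.

    Hypothetical helper: it only assembles arguments and runs nothing.
    """
    # Sqoop selects the output file format with one of these flags.
    format_flags = {
        "text": "--as-textfile",
        "avro": "--as-avrodatafile",
        "sequence": "--as-sequencefile",
        "parquet": "--as-parquetfile",
    }
    args = [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--username", username,
        "--password-file", password_file,
        "--table", table,
        "--target-dir", target_dir,
        format_flags[file_format],
    ]
    # A field delimiter is only meaningful for delimited text output.
    if file_format == "text":
        args += ["--fields-terminated-by", delimiter]
    return args


def hdfs_put_args(local_path, hdfs_path):
    """Build the argv list that copies a local file into HDFS."""
    return ["hdfs", "dfs", "-put", local_path, hdfs_path]
```

On a gateway node with Sqoop installed, a list like this could be passed to `subprocess.call`. An export is the mirror image: `sqoop export` with `--export-dir` pointing at the HDFS directory in place of `--target-dir`.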
Transform, Stage, Store
Convert a set of data values in a given format stored in HDFS into new data values and/or a new data format and write them into HDFS. This includes writing Spark applications in both Scala and Python:
· Load data from HDFS and store results back to HDFS using Spark
· Join disparate datasets together using Spark
· Calculate aggregate statistics (e.g., average or sum) using Spark
· Filter data into a smaller dataset using Spark
· Write a query that produces ranked or sorted data using Spark
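The five Spark items above can be chained in a single job. The following is a rough sketch using the PySpark RDD API; the two comma-separated input layouts (`order_id,date,customer_id,status` and `item_id,order_id,subtotal`), the function names, and all paths are assumptions made up for illustration.

```python
def parse_order(line):
    """'order_id,date,customer_id,status' -> (order_id, status)."""
    fields = line.split(",")
    return (int(fields[0]), fields[3])


def parse_item(line):
    """'item_id,order_id,subtotal' -> (order_id, subtotal)."""
    fields = line.split(",")
    return (int(fields[1]), float(fields[2]))


def format_row(pair):
    """(order_id, total) -> tab-separated output line."""
    order_id, total = pair
    return "%d\t%.2f" % (order_id, total)


def revenue_per_closed_order(sc, orders_path, items_path, out_path):
    """Load, filter, join, aggregate, sort, and store, in that order.

    `sc` is an existing SparkContext, e.g. the one the pyspark shell provides.
    """
    # Load: read both datasets from HDFS as key/value pairs keyed by order_id.
    orders = sc.textFile(orders_path).map(parse_order)
    items = sc.textFile(items_path).map(parse_item)
    # Filter: keep only orders whose status is CLOSED.
    closed = orders.filter(lambda kv: kv[1] == "CLOSED")
    # Join: yields (order_id, (status, subtotal)) for matching keys.
    joined = closed.join(items)
    # Aggregate: sum the subtotals per order.
    totals = (joined.map(lambda kv: (kv[0], kv[1][1]))
                    .reduceByKey(lambda a, b: a + b))
    # Rank: highest revenue first.
    ranked = totals.sortBy(lambda kv: kv[1], ascending=False)
    # Store: write tab-separated text back to HDFS.
    ranked.map(format_row).saveAsTextFile(out_path)
```

In the `pyspark` shell the context already exists as `sc`, so the job would be started with something like `revenue_per_closed_order(sc, "/user/me/orders", "/user/me/order_items", "/user/me/revenue")`. Keeping the parsing and formatting in plain functions means they can be checked without a cluster.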
Data Analysis
Use Data Definition Language (DDL) to create tables in the Hive metastore for use by Hive and Impala.
· Read and/or create a table in the Hive metastore in a given schema
· Extract an Avro schema from a set of datafiles using avro-tools
· Create a table in the Hive metastore using the Avro file format and an external schema file
· Improve query performance by creating partitioned tables in the Hive metastore
· Evolve an Avro schema by changing JSON files
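As an illustration of the Avro items above, the sketch below does two things in Python: it evolves a record schema by appending a nullable field with a default, which is the usual backward-compatible change, and it renders the Hive DDL for an external, optionally partitioned Avro table that reads its schema from a file. The schema file itself would normally come from `avro-tools getschema` run against a data file. The helper names, table name, partition column, and paths are all hypothetical.

```python
import copy
import json


def add_optional_field(schema, name, avro_type="string"):
    """Return a copy of an Avro record schema with one new nullable field.

    The new field defaults to null, so files written with the old schema
    can still be read with the evolved one.
    """
    evolved = copy.deepcopy(schema)
    evolved["fields"].append(
        {"name": name, "type": ["null", avro_type], "default": None})
    return evolved


def schema_to_avsc(schema):
    """Serialize a schema dict to the JSON text stored in an .avsc file."""
    return json.dumps(schema, indent=2)


def avro_table_ddl(table, location, schema_url, partition_col=None):
    """Render HiveQL for an external Avro table with an external schema file.

    The partition column is assumed to be a STRING for simplicity.
    """
    parts = ["CREATE EXTERNAL TABLE %s" % table]
    if partition_col:
        parts.append("PARTITIONED BY (%s STRING)" % partition_col)
    parts.append("STORED AS AVRO")
    parts.append("LOCATION '%s'" % location)
    parts.append("TBLPROPERTIES ('avro.schema.url'='%s')" % schema_url)
    return "\n".join(parts) + ";"
```

The rendered statement can be pasted into the Hive shell or Beeline. After the evolved `.avsc` replaces the old file at the same `avro.schema.url`, the table picks up the new column without being recreated.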