
An Introduction to Basic Sqoop Usage

2014-11-13 19:15

Note: all of the examples below use MySQL.

Importing RDBMS data into Hive

sqoop import \
  --connect jdbc:mysql://172.17.210.180/dc_scheduler_client \
  --username dc_scheduler_cli --password dc_scheduler_cli \
  --table t_class --split-by id -m 2 --verbose \
  --hive-import --create-hive-table --hive-table dc_test.t_class1 \
  --fields-terminated-by '\t' \
  --bindir /root/tmp --outdir /root/tmp \
  --null-string '\\N' --null-non-string '\\N'

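import: run an import
connect: the JDBC connection string
username: the MySQL user name
password: the MySQL password (given on the command line it is visible in the process list; Sqoop also accepts -P to prompt for it instead)
table: the source table in MySQL
split-by: the column used to divide the rows among map tasks; used together with -m, the number of map tasks
verbose: print detailed logs
hive-import: load the imported data into Hive
create-hive-table: create the Hive table from the source table's structure; the job fails if the Hive table already exists
hive-table: the target Hive table, here given as database.table
fields-terminated-by: the field delimiter of the table data in Hive
bindir: where the .class files and jar compiled from Sqoop's generated Java code are stored
outdir: where the Java source code generated by Sqoop is stored
null-string: the string written in place of a NULL value in a string column of the source table
null-non-string: the string written in place of a NULL value in a non-string column of the source table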
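
To make the split-by / -m pairing more concrete, here is a rough sketch of how a numeric id range might be divided between map tasks. It is a simplified model, not Sqoop's actual splitter: Sqoop first queries the minimum and maximum of the split column, and the helper below (split_ranges, a name made up for this illustration) only mimics the idea of cutting that range into -m pieces. The values 1 and 100 are assumed, not taken from the real t_class table.

```shell
#!/bin/sh
# split_ranges MIN MAX M
# Simplified illustration (an assumption, not Sqoop's real code):
# cut the range [MIN, MAX] of the split column into M consecutive pieces
# and print the WHERE condition each map task would roughly get.
split_ranges() {
  min=$1; max=$2; m=$3
  size=$(( (max - min) / m ))
  # Never let a piece shrink to zero width
  if [ "$size" -lt 1 ]; then size=1; fi
  lo=$min
  i=1
  while [ "$i" -le "$m" ]; do
    if [ "$i" -eq "$m" ]; then
      # The last piece includes the upper bound
      echo "mapper $i: id >= $lo AND id <= $max"
    else
      hi=$(( lo + size ))
      echo "mapper $i: id >= $lo AND id < $hi"
      lo=$hi
    fi
    i=$(( i + 1 ))
  done
}

# Assumed bounds: MIN(id)=1, MAX(id)=100, with -m 2 as in the command above
split_ranges 1 100 2
```

If the ids are spread evenly, each map task gets about the same amount of work; a heavily skewed split column leaves some tasks with far more rows than others, which is why the choice of split-by column matters.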

Exporting Hive data to an RDBMS

sqoop export \
  --export-dir /hive/warehouse/dc_test.db/t_class1 \
  --connect jdbc:mysql://172.17.210.180/dc_scheduler_client \
  --username dc_scheduler_cli --password dc_scheduler_cli \
  --update-key id --update-mode allowinsert \
  --table t_class --input-fields-terminated-by '\t' -m 1 \
  --bindir /root/tmp --outdir /root/tmp \
  --input-null-string '\\N' --input-null-non-string '\\N'

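export: export data out of Hive
export-dir: the HDFS directory that holds the Hive table's data files
update-key: the column used to match rows when updating
update-mode: the update mode (updateonly: only update rows that already exist; allowinsert: update rows that exist and insert the ones that do not)
table: the table on the destination side
input-fields-terminated-by: the field delimiter of the data in Hive
input-null-string: the string in the input data that is read as NULL for string columns
input-null-non-string: the string in the input data that is read as NULL for non-string columns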
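
As a small illustration of the NULL marker and the tab delimiter, the sketch below builds two made-up rows in the shape the export would read (tab-separated fields, with \N standing for NULL, which is the marker Hive uses by default) and counts the markers per row. The rows and the count_nulls helper are hypothetical and exist only for this example; nothing here calls Sqoop or touches HDFS.

```shell
#!/bin/sh
# count_nulls: read tab-separated rows on stdin and report, for each row,
# how many fields hold the \N marker. Hypothetical helper for illustration.
count_nulls() {
  awk -F '\t' '{
    n = 0
    # "\\N" inside awk is the two-character string: backslash, N
    for (i = 1; i <= NF; i++) if ($i == "\\N") n++
    print $1 ": " n " NULL field(s)"
  }'
}

# Made-up sample rows: id, name, score; \N marks a NULL field
printf '1\tmath\t\\N\n2\t\\N\t30\n' | count_nulls
```

A quick check like this on a few lines of the real data files can confirm that the delimiter and NULL marker match what is passed to --input-fields-terminated-by and --input-null-string before running the export.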
