Automatically loading stock data into Hive and exporting it to Oracle
2016-06-24 18:36
------import_stock_d.py-----------------------------------
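#!/usr/bin/python
import tushare as ts
import os
import re

stocklistpath = '/home/cloudera/data/list/stocklist.txt'
savepath = '/home/cloudera/data/data/'

# Find codes such as SZ000001 or SH600000 in the list file and
# download the daily history for each one.
pattern = r"S[ZH]\d{6}"
with open(stocklistpath, 'r') as openstock:
    for listline in openstock:
        for code in re.findall(pattern, listline):
            stocknum = code[2:8]          # strip the SZ/SH prefix
            df = ts.get_hist_data(stocknum)
            if df is None:                # no data returned for this code
                continue
            df.to_csv(savepath + code + '.txt')
            print(code)

# Prefix every line of every saved file with the stock code, so the
# code becomes the first column when the files are loaded into Hive.
for path, d, filelist in os.walk(savepath):
    for filename in filelist:
        filepath = os.path.join(path, filename)
        print(filepath)
        prefix = filename[0:8] + ','
        print(prefix)
        with open(filepath, 'r+') as f:
            lines = f.readlines()
            f.seek(0, 0)                  # rewind before overwriting
            f.writelines(prefix + line for line in lines)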
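The two core steps of the script (extracting stock codes with the regular expression, and prefixing each line of a saved file with its code) can be tried without tushare or the Cloudera paths. The following is only a minimal sketch: the helper names `extract_codes` and `prefix_lines` are made up for illustration and are not part of the original script.

```python
import os
import re
import tempfile

# Same pattern as the script: "SZ" or "SH" followed by six digits.
CODE_PATTERN = re.compile(r"S[ZH]\d{6}")


def extract_codes(text):
    """Return (full_code, numeric_code) pairs, e.g. ('SZ000001', '000001')."""
    return [(code, code[2:8]) for code in CODE_PATTERN.findall(text)]


def prefix_lines(filepath, prefix):
    """Rewrite filepath in place so that every line starts with prefix."""
    with open(filepath, "r") as f:
        lines = f.readlines()
    with open(filepath, "w") as f:
        f.writelines(prefix + line for line in lines)


def demo():
    """Run both helpers on throwaway data and return the rewritten lines."""
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "SZ000001.txt")
    with open(path, "w") as f:
        # Illustrative sample rows, not real market data.
        f.write("date,open\n2016-06-24,10.0\n")
    prefix_lines(path, os.path.basename(path)[0:8] + ",")
    with open(path) as f:
        return f.readlines()
```

Calling `demo()` returns the file's lines with `SZ000001,` in front of each one, including the CSV header line, which matches what the script above does to its downloaded files.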
--------------------------------StockRun.sh------------------------------------------------------
# 1. Download the data and prefix each line with its stock code
python /home/cloudera/python/import_stock_d.py
# 2. Copy the data directory into HDFS under /stock
hadoop fs -put /home/cloudera/data/data /stock
# 3. Load the files into the Hive table
hive -e "LOAD DATA INPATH '/stock/data/*' OVERWRITE INTO TABLE import_stock_d"
# 4. Keep only the rows whose turnover is not null
hive -e "insert overwrite table import_stock_d select * from import_stock_d where turnover is not null"
# 5. Export the Hive warehouse files to Oracle
sqoop export --table import_stock_d \
  --connect jdbc:oracle:thin:@192.168.1.10:1521:orcl \
  --username stock --password stock \
  --export-dir '/user/hive/warehouse/import_stock_d/*' \
  --input-fields-terminated-by ',' --input-lines-terminated-by '\n' \
  --columns 'code,T_DATE,OPEN,HIGH,CLOSE,LOW,VOLUME,PRICE_CHANGE,P_CHANGE,MA5,MA10,MA20,V_MA5,V_MA10,V_MA20,TURNOVER'