您的位置:首页 > 数据库 > MySQL

Hive汇总统计数据自动化传输到Mysql数据库-跑批参数文本配置及提取使用

2017-06-13 17:23 471 查看
参照:Hive汇总统计数据自动化传输到Mysql数据库--->http://blog.csdn.net/babyfish13/article/details/72701512
本文是对【Hive汇总统计数据自动化传输到Mysql数据库】一文的补充。
1、参数文件配置及示例
/Users/nisj/Desktop/honey_report/BatchParConfig.txt
# BatchParConfig file
----------------------
# 测试表1
ConfigName = 'user_profile'
srcSelectText = """select uid,regexp_replace(nickname,'\t','') nickname,avatar,gender,source,state from xxx_user_profile limit 10000;"""
mysqlConfig ='targetMysqlConfig_funnyai_data'
targetTabName = 'xxxxxx_test_0523'
targetTabNameDesc = '测试用表'
Config[user_profile] is finished!

# 测试表2
ConfigName = 'payinfo'
srcSelectText = """select a1.nickname,a1.room_id,a1.uid,a1.pt_day,count(today_add) today_subscriber, sum(today_add) today_subscriber_new_user,sum(case when a1.today_add=1 then a2.message_cnt else null end) message_send_in_thisRoom,sum(case when a1.today_add=1 then amount else 0 end) pay_amount <-->
from xx0522_newadd_user a1 <-->
left join (select roomid,uid,pt_day,count(*) message_cnt from oss_chushou_message_send where pt_day between '2017-05-15' and '2017-05-21' group by roomid,uid,pt_day) a2 on a1.room_id=int(a2.roomid) and a1.view_uid=int(a2.uid) and a1.pt_day=a2.pt_day <-->
left join (select uid,pt_day,sum(amount) amount from oss_pay_info where pt_day between '2017-05-15' and '2017-05-21' group by uid,pt_day) a3 on a1.view_uid=int(a3.uid) and a1.pt_day=a3.pt_day <-->
group by a1.nickname,a1.room_id,a1.uid,a1.pt_day;""" <-->
mysqlConfig = 'targetMysqlConfig_funnyai_data'
targetTabName = 'xxxxxx_test_0608'
targetTabNameDesc = '测试用表2'
Config[payinfo] is finished!

2、配置参数的提取和使用
/Users/nisj/Desktop/honey_report/BatchParGet.py
#!/usr/bin/env python
# encoding: utf-8

def ParGet(ConfigName):
with open("BatchParConfig.txt") as ConfigFile:
ConfigFileList = []
for ConfigLine in ConfigFile:
ConfigFileList.append(ConfigLine.replace('\n', ''))

ConfigFileList_indexStart = ConfigFileList.index('ConfigName = \'{ConfigName}\''.format(ConfigName=ConfigName))
ConfigFileList_indexEnd = ConfigFileList.index(
'Config[{ConfigName}] is finished!'.format(ConfigName=ConfigName))

# print ConfigFileList_indexStart, ConfigFileList_indexEnd
# print ConfigFileList[ConfigFileList_indexStart:ConfigFileList_indexEnd]

srcSelectText = mysqlConfig = targetTabName = targetTabNameDesc = ''
srcSelectCommand = ''
for ConfigFileList_index in range(ConfigFileList_indexStart, ConfigFileList_indexEnd):
if ' <-->' in ConfigFileList[ConfigFileList_index]:
srcSelectCommand = srcSelectCommand + str(ConfigFileList[ConfigFileList_index]).replace(' <-->', ' ')
else:
exec (ConfigFileList[ConfigFileList_index])
exec (srcSelectCommand)
return srcSelectText, mysqlConfig, targetTabName, targetTabNameDesc

print ParGet(ConfigName='payinfo')

3、有关说明
这里的配置文件每一个配置以【ConfigName = '{ConfigName}'】开始,以【Config[{ConfigName}] is finished!】结尾,取配置文件的Python脚本只取两者之间的内容。

对于srcSelectText的sql多行的情况,需要在每一个的行尾用【 <-->】以标识。

此处,每一个配置主要有四项内容,srcSelectText, mysqlConfig, targetTabName, targetTabNameDesc;后期可根据需要增改。
内容来自用户分享和网络整理,不保证内容的准确性,如有侵权内容,可联系管理员处理 点击这里给我发消息
标签: 
相关文章推荐