
GPDB 4.3 Administrator Guide: Chapter 9, Managing a Greenplum System

2015-09-24 09:17

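Chapter 9: Managing a Greenplum System

I. Monitoring Database Activity and Performance

Use the Greenplum Command Center tool.

II. Monitoring System State

(1) Enabling System Alerts and Notifications

Greenplum Database can be configured to raise SNMP alerts and send email notifications to the system administrator when certain database events occur. These events include: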

  • All PANIC-level error conditions
  • All FATAL-level error conditions
  • ERROR-level conditions that are "internal errors" (for example, SIGSEGV errors)
  • Database system shutdown and restart
  • Segment failure and recovery
  • Standby master out-of-sync conditions
  • Master host manual shutdown or other software problem (in certain failure scenarios, Greenplum Database cannot send an alert or notification)

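1. SNMP configuration

a. Prerequisite: confirm that SNMP is installed on the OS by checking for /usr/sbin/snmpd, the /etc/snmp/ directory, and snmpd.conf.

b. After installation, enable snmpd to start at boot: # /sbin/chkconfig snmpd on

c. Verify that snmpd is running: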

# snmpwalk -v 1 -c community_name localhost .1.3.6.1.2.1.1.1.0

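d. Configure SNMP notifications

i) Set the following parameters on the master host with the gpconfig utility:

gp_snmp_community: the SNMP community name

gp_snmp_monitor_address: hostname:port; separate multiple addresses with commas

gp_snmp_use_inform_or_trap: either trap or inform

Example: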

$ gpconfig -c gp_snmp_community -v public --masteronly

$ gpconfig -c gp_snmp_monitor_address -v mdw:162 --masteronly

$ gpconfig -c gp_snmp_use_inform_or_trap -v trap --masteronly
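The hostname:port, comma-separated format of gp_snmp_monitor_address can be checked before it is handed to gpconfig. The following is a minimal Python sketch, not part of Greenplum: the function name parse_monitor_addresses, the host smdw, and the fallback port 162 (the conventional SNMP trap port) are assumptions made for illustration.

```python
def parse_monitor_addresses(value, default_port=162):
    """Split a gp_snmp_monitor_address value into (host, port) tuples.

    default_port=162 is an assumption (the conventional SNMP trap port),
    used only when an entry carries no explicit port.
    """
    addresses = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate stray commas
        host, _, port = item.partition(":")
        addresses.append((host, int(port) if port else default_port))
    return addresses


# "smdw" is a hypothetical second monitor host.
print(parse_monitor_addresses("mdw:162, smdw:1162"))
```

A value that fails to parse here (for example a non-numeric port) raises ValueError, which is a cheap way to catch typos before reconfiguring the master.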

ii) Test SNMP notifications

# /usr/sbin/snmptrapd -m ALL -Lf ~/filename.log

-Lf: log traps to a file

-Le: log traps to standard error

-m ALL: load all available MIBs (Management Information Bases)

 

2. Enabling email notifications

a. Open the $MASTER_DATA_DIRECTORY/postgresql.conf file

b. Edit the EMAIL ALERTS section, for example:

gp_email_smtp_server='smtp.company.com:25'

gp_email_smtp_userid='gpadmin@company.com'

gp_email_smtp_password='mypassword'

gp_email_from='Greenplum Database <gpadmin@company.com>'

gp_email_to='dba@company.com;John Smith <jsmith@company.com>'

A public, external SMTP server such as Gmail can also be used:

gp_email_smtp_server='smtp.gmail.com:25'

#gp_email_smtp_userid=''

#gp_email_smtp_password=''

gp_email_from='gpadmin@company.com'

gp_email_to='test_account@gmail.com'

c. Save and close postgresql.conf

d. Reload the Greenplum Database postgresql.conf

$ gpstop -u
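The gp_email_to example above separates recipients with semicolons and allows an optional display name. As a rough sanity check before reloading the configuration, the value can be split with Python's standard email.utils.parseaddr. This is only a sketch: the function name parse_recipients is hypothetical, and the semicolon separator is taken from the example rather than from a formal specification.

```python
from email.utils import parseaddr


def parse_recipients(gp_email_to):
    """Split a gp_email_to value into (display_name, address) pairs."""
    recipients = []
    for entry in gp_email_to.split(";"):
        entry = entry.strip()
        if entry:
            recipients.append(parseaddr(entry))
    return recipients


for name, address in parse_recipients(
    "dba@company.com;John Smith <jsmith@company.com>"
):
    print(repr(name), address)
```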

3. Testing email notifications

$ ping my_email_server

$ psql template1

=# SELECT gp_elog('Test GPDB Email',true);

 

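(2) Checking System State

$ gpstate     # brief status of the segment instances

$ gpstate -s  # detailed status of the Greenplum system

$ gpstate -m  # mirror information

$ gpstate -c  # primary-to-mirror mapping

$ gpstate -f  # standby master status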

 

(3) Checking Disk Space

· Check free disk space

=# SELECT * FROM gp_toolkit.gp_disk_free ORDER BY dfsegment;

· Check the disk space used by each database

=> SELECT * FROM gp_toolkit.gp_size_of_database ORDER BY soddatname;

· Check the disk space used by tables

=> SELECT relname AS name, sotdsize AS size, sotdtoastsize AS toast, sotdadditionalsize AS other

FROM gp_toolkit.gp_size_of_table_disk AS sotd, pg_class

WHERE sotd.sotdoid=pg_class.oid ORDER BY relname;

· Check the disk space used by indexes

=> SELECT soisize, relname AS indexname

FROM pg_class, gp_toolkit.gp_size_of_index

WHERE pg_class.oid=gp_size_of_index.soioid

AND pg_class.relkind='i';

 

(4) Checking Data Distribution

1. View a table's distribution key

=# \d+ sales

Table "retail.sales"

Column | Type | Modifiers | Description

-------------+--------------+-----------+-------------

sale_id    | integer       |               |

amt          | float            |               |

date         | date            |               |

Has OIDs: no

Distributed by: (sale_id)  

2. View how the data is distributed across segments

=# SELECT gp_segment_id, count(*)  FROM table_name GROUP BY gp_segment_id;
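The per-segment counts returned by the query above can be reduced to a single number to make uneven distribution easier to spot. The sketch below is illustrative only: skew_ratio is a hypothetical helper, and the metric it computes (how far the fullest segment sits above the mean) is one simple choice, not a calculation defined by Greenplum.

```python
def skew_ratio(rows_per_segment, segment_count=None):
    """Return (max - mean) / mean for per-segment row counts.

    rows_per_segment maps gp_segment_id -> count(*), as returned by the
    GROUP BY query. Segments holding no rows do not appear in that result,
    so pass segment_count to include them as zeros.
    """
    counts = list(rows_per_segment.values())
    if segment_count is not None:
        counts += [0] * max(0, segment_count - len(counts))
    if not counts or sum(counts) == 0:
        return 0.0
    mean = sum(counts) / len(counts)
    return (max(counts) - mean) / mean


# Made-up counts for a four-segment system.
print(skew_ratio({0: 100, 1: 100, 2: 100, 3: 100}))  # evenly distributed
print(skew_ratio({0: 150, 1: 50}))                   # one segment holds most rows
```

A result near 0 means the rows are spread evenly; larger values mean at least one segment holds noticeably more than its share.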

3. Check for query processing skew

=# SELECT gp_segment_id, count(*) FROM table_name

WHERE column='value' GROUP BY gp_segment_id;

· Extreme skew warning

When a hash join query is severely skewed, the following warning is raised:

Extreme skew in the innerside of Hashjoin

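Take the following steps to avoid skew:

a. Make sure that all fact tables have been analyzed.

b. Verify that any temporary tables used by the query have also been analyzed.

c. Review the EXPLAIN ANALYZE plan for the query and look for the following:

· If a scan with a multi-column filter produces more rows than estimated, set the gp_selectivity_damping_factor server configuration parameter to 2 or higher and retest the query.

· If skew occurs while joining a single, relatively small fact table (fewer than 5000 rows), set the gp_segments_for_planner server configuration parameter to 1 and retest the query.

d. Check whether the filters applied in the query match the distribution keys of the base tables. If the filters and the distribution keys are the same, consider redistributing the base tables on different distribution keys.

e. Check the cardinality of the join keys. If they have low cardinality, try rewriting the query with different join columns or with additional filters to reduce the number of rows. These changes may alter the query semantics.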

(5) Viewing Metadata for Database Objects

· View the most recent operations performed on an object

=> SELECT schemaname as schema, objname as table, usename as role, actionname as action, subtype as type, statime as time 

FROM pg_stat_operations  WHERE objname='cust'; 

schema | table | role | action | type | time

--------+-------+------+---------+-------+--------------------------

sales | cust | main | CREATE | TABLE | 2010-02-09 18:10:07.867977-08

sales | cust | main | VACUUM | | 2010-02-10 13:32:39.068219-08

sales | cust | main | ANALYZE | | 2010-02-25 16:07:01.157168-08

(3 rows)

· View an object's definition:

=> \d+ mytable

(6) Viewing Session Memory Usage

· Create the session_level_memory_consumption view

$ psql -d testdb -f $GPHOME/share/postgresql/contrib/gp_session_state.sql

 

· About the session_level_memory_consumption view

The is_runaway column indicates whether a session is a runaway session; the runaway_detector_activation_percent parameter controls when a session is marked as runaway.

column              | type    | description
--------------------+---------+-------------
datname             | name    | Name of the database that the session is connected to.
sess_id             | integer | Session ID.
usename             | name    | Name of the session user.
current_query       | text    | Current SQL query that the session is running.
segid               | integer | Segment ID.
vmem_mb             | integer | Total vmem memory usage for the session in MB.
is_runaway          | boolean | Session is marked as runaway on the segment.
qe_count            | integer | Number of query processes for the session.
active_qe_count     | integer | Number of active query processes for the session.
dirty_qe_count      | integer | Number of query processes that have not yet released their memory. The value is -1 for sessions that are not running.
runaway_vmem_mb     | integer | Amount of vmem memory that the session was consuming when it was marked as a runaway session.
runaway_command_cnt | integer | Command count for the session when it was marked as a runaway session.
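Because the view carries both sess_id and a segment identifier, a session can appear on several rows. The following sketch rolls such rows up to one entry per session. It assumes the rows have already been fetched as dictionaries keyed by the column names above; the function name summarize_sessions and the sample values are made up for illustration.

```python
from collections import defaultdict


def summarize_sessions(rows):
    """Sum vmem_mb per sess_id and flag sessions marked runaway anywhere."""
    totals = defaultdict(lambda: {"vmem_mb": 0, "is_runaway": False})
    for row in rows:
        summary = totals[row["sess_id"]]
        summary["vmem_mb"] += row["vmem_mb"]
        summary["is_runaway"] = summary["is_runaway"] or row["is_runaway"]
    return dict(totals)


# Hypothetical rows: session 7 runs on two segments, session 9 on one.
sample_rows = [
    {"sess_id": 7, "segid": 0, "vmem_mb": 120, "is_runaway": False},
    {"sess_id": 7, "segid": 1, "vmem_mb": 480, "is_runaway": True},
    {"sess_id": 9, "segid": 0, "vmem_mb": 30, "is_runaway": False},
]
for sess_id, summary in sorted(summarize_sessions(sample_rows).items()):
    print(sess_id, summary)
```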

 

(7) Viewing Query Workfile Usage

The information in these views can be used to choose values for the gp_workfile_limit_per_query and gp_workfile_limit_per_segment parameters.

Related views in the gp_toolkit schema:

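gp_workfile_entries: one row for each operator currently using disk space for workfiles on a segment

gp_workfile_usage_per_query: one row for each query currently using disk space for workfiles on a segment

gp_workfile_usage_per_segment: one row for each segment, with the total disk space currently used for workfiles on that segment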

 

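III. Viewing the Database Server Log Files

The daily log files are in the pg_log directory on the master and on every segment host; the log format is normally CSV.

Search the log files: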

$ gplogfilter -n 3

$ gpssh -f seg_host_file

=> source /usr/local/greenplum-db/greenplum_path.sh

=> gplogfilter -n 3 /gpdata/gp*/pg_log/gpdb*.log
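Since the server logs are normally CSV, they can also be scanned with a short script when gplogfilter is not at hand. The sketch below keeps only rows in which some field is exactly ERROR, FATAL, or PANIC. It deliberately does not assume a column position for the severity, and the sample lines are made up and much shorter than a real Greenplum log row.

```python
import csv
import io

TROUBLE_LEVELS = {"ERROR", "FATAL", "PANIC"}


def trouble_rows(log_text):
    """Return the CSV rows that contain a field equal to a trouble level."""
    reader = csv.reader(io.StringIO(log_text))
    return [row for row in reader if TROUBLE_LEVELS.intersection(row)]


# Hypothetical, simplified log lines (not the real column layout).
sample_log = (
    '2015-09-24 09:17:00,gpadmin,testdb,LOG,"statement: SELECT 1"\n'
    '2015-09-24 09:18:00,gpadmin,testdb,ERROR,"relation ""foo"" does not exist"\n'
)
for row in trouble_rows(sample_log):
    print(row)
```

Using the csv module rather than a plain substring search keeps quoted messages that span commas or embedded quotes intact, and avoids matching the word ERROR inside an ordinary statement text.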

 

IV. Using gp_toolkit

=> ALTER ROLE myrole SET search_path TO myschema,gp_toolkit;


V. Greenplum Database SNMP OIDs and Error Codes

(1) Greenplum Database SNMP OIDs

This is the Greenplum Database OID hierarchy structure:

  • iso(1)
  • identified-organization(3)
  • dod(6)
  • internet(1)
  • private(4)
  • enterprises(1)
  • gpdbMIB(31327)
  • gpdbObjects(1)
  • gpdbAlertMsg(1)

    • gpdbAlertMsg  1.3.6.1.4.1.31327.1.1: STRING: alert message text

    • gpdbAlertSeverity  1.3.6.1.4.1.31327.1.2: INTEGER: severity level 

                    gpdbAlertSeverity can have one of the following values: 

     

    • gpdbSevUnknown(0)
    • gpdbSevOk(1)
    • gpdbSevWarning(2) 
    • gpdbSevError(3) 
    • gpdbSevFatal(4) 
    • gpdbSevPanic(5) 
    • gpdbSevSystemDegraded(6) 
    • gpdbSevSystemDown(7) 
    • gpdbAlertSqlstate  1.3.6.1.4.1.31327.1.3: STRING: SQL standard error codes

            For a list of codes, see SQL Standard Error Codes.

    • gpdbAlertDetail 

    1.3.6.1.4.1.31327.1.4: STRING: detailed alert message text

    • gpdbAlertSqlStmt 

    1.3.6.1.4.1.31327.1.5: STRING: SQL statement generating this alert if  applicable 

    • gpdbAlertSystemName 

    1.3.6.1.4.1.31327.1.6: STRING: hostname
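The OIDs and severity values listed above are enough to label the variable bindings of a received trap. The sketch below does that with plain dictionaries. The function name decode_varbind is hypothetical, and it assumes the OIDs arrive exactly as listed, with no extra instance suffix; a real trap receiver may present them differently.

```python
# Prefix iso(1).identified-organization(3).dod(6).internet(1).private(4)
# .enterprises(1).gpdbMIB(31327).gpdbObjects(1), as listed above.
GPDB_OBJECTS = "1.3.6.1.4.1.31327.1"

ALERT_FIELDS = {
    1: "gpdbAlertMsg",
    2: "gpdbAlertSeverity",
    3: "gpdbAlertSqlstate",
    4: "gpdbAlertDetail",
    5: "gpdbAlertSqlStmt",
    6: "gpdbAlertSystemName",
}

SEVERITIES = {
    0: "gpdbSevUnknown",
    1: "gpdbSevOk",
    2: "gpdbSevWarning",
    3: "gpdbSevError",
    4: "gpdbSevFatal",
    5: "gpdbSevPanic",
    6: "gpdbSevSystemDegraded",
    7: "gpdbSevSystemDown",
}


def decode_varbind(oid, value):
    """Map a Greenplum alert OID and value to (field_name, readable_value).

    Returns None for OIDs outside the gpdbObjects subtree.
    """
    prefix, _, last = oid.lstrip(".").rpartition(".")
    if prefix != GPDB_OBJECTS or not last.isdigit():
        return None
    name = ALERT_FIELDS.get(int(last))
    if name is None:
        return None
    if name == "gpdbAlertSeverity":
        value = SEVERITIES.get(int(value), value)
    return name, value


print(decode_varbind("1.3.6.1.4.1.31327.1.2", 4))
```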


(2) SQL Standard Error Codes

See the table below:

     

Source: ITPUB Blog, http://blog.itpub.net/16976507/viewspace-1807806/. If you repost this article, please credit the source; otherwise legal liability may be pursued.
