
Troubleshooting the OpsCenter dashboard

2015-08-10 10:03

Environment

OpsCenter 5.2

CentOS 6.6

Cassandra 2.0.x

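Problem

After monitoring a Cassandra cluster for a while (roughly one day), the OpsCenter dashboard always stops displaying data.

However, the datastax-agent process on the Cassandra nodes is still running normally.

The datastax-agent log then shows: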

WARN [Thread-10] .... operations dropped so far.
WARN [Thread-10] .... Cassandra operation queue is full, discarding cassandra operation

Error when proccessing cassandra callcom.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /192.168.47.222:9042 (com.datastax.driver.core.TransportException: [/192.168.47.222:9042] Connection has been closed))

ERROR [Reconnection-0] 2015-08-05 16:06:39,841 Unknown error during reconnection to /192.168.47.222:9042, scheduling retry in 8000 milliseconds
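
To see how often these symptoms occur, the log can be scanned for the messages quoted above. The following is a minimal sketch, not part of the original fix: the function name and the labels are hypothetical, and only the substrings copied from the excerpt above are assumed to appear in the log.

```python
from collections import Counter

# Substrings copied from the log excerpt above. Anything else about the
# agent's log format is an assumption.
PATTERNS = {
    "queue_full": "Cassandra operation queue is full",
    "no_host": "NoHostAvailableException",
    "reconnect_error": "Unknown error during reconnection",
}

def summarize_agent_log(lines):
    """Count how many lines contain each known symptom string."""
    counts = Counter()
    for line in lines:
        for name, needle in PATTERNS.items():
            if needle in line:
                counts[name] += 1
    return counts
```

It accepts any iterable of lines, so an open file handle for the agent log can be passed in directly.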


The initial diagnosis is that too many Cassandra requests are the cause.

Solution

Add the following parameters to /var/lib/datastax-agent/conf/address.yaml:

stomp_interface: opscenterIP
use_ssl: 0
async_pool_size: 200
thrift_max_conns: 200
async_queue_size: 20000
hosts: ["host1","host2"]  # IPs of the cluster nodes
local_interface: localhost
cassandra_conf: /xxx/apache-cassandra-2.0.15/conf/cassandra.yaml
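
A misspelled option name in address.yaml is easy to miss, so a quick check of the key names can help. This is only a sketch under stated assumptions: the expected set lists just the options used in this post (the agent accepts others), the parser handles only flat `key: value` lines, and both function names are hypothetical.

```python
# Option names taken from this post only; the agent accepts other options too.
EXPECTED_KEYS = {
    "stomp_interface", "use_ssl", "async_pool_size", "thrift_max_conns",
    "async_queue_size", "hosts", "local_interface", "cassandra_conf",
}

def top_level_keys(text):
    """Return the top-level keys of a flat 'key: value' YAML document."""
    keys = []
    for line in text.splitlines():
        stripped = line.strip()
        # Skip blank lines, comments, and indented (nested) lines.
        if not stripped or stripped.startswith("#") or line[0].isspace():
            continue
        key, sep, _ = stripped.partition(":")
        if sep:
            keys.append(key.strip())
    return keys

def unexpected_keys(text):
    """Keys in the file that are not in EXPECTED_KEYS (possible typos)."""
    return [k for k in top_level_keys(text) if k not in EXPECTED_KEYS]
```

Any name it reports is either a typo or a valid option that this post does not cover, so the result should be checked against the official documentation.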


In $CASSANDRA_HOME/conf/clusters/cluster_name.conf, change:

[stomp]
batch_size = 10000
push_interval = 10
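
To confirm which values are actually in the cluster configuration file, the [stomp] section can be read with Python's standard configparser. This is a minimal sketch: the function name is hypothetical, and the fallbacks are the default values quoted in this post (100 and 3 seconds), not values read from OpsCenter itself.

```python
import configparser

def read_stomp_settings(conf_text):
    """Return (batch_size, push_interval) from the [stomp] section."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    # Fallbacks are the defaults quoted in this post, used when the
    # section or the option is missing.
    batch_size = parser.getint("stomp", "batch_size", fallback=100)
    push_interval = parser.getint("stomp", "push_interval", fallback=3)
    return batch_size, push_interval
```

Pass in the text of cluster_name.conf. A file without a [stomp] section yields the fallback values.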


Parameter notes

# address.yaml parameters
thrift_max_conns - the max number of concurrent connections to make to the local node

async_pool_size - the size of the thread pool that pulls from a queue of inserts and inserts them into Cassandra

async_queue_size - the size of the queue of inserts to send to Cassandra; if the queue fills up, additional operations are dropped
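
The interaction between the pool size and the queue size can be illustrated with a toy model. This sketch is not the agent's implementation: it assumes discrete ticks in which a fixed number of operations arrive and each worker completes one operation, and every name in it is hypothetical.

```python
from collections import deque

def simulate(ticks, arrivals_per_tick, pool_size, queue_size):
    """Return how many operations are dropped in a simple tick-based model.

    Each tick, new operations are enqueued (and dropped if the queue is
    full), then each worker in the pool removes one queued operation.
    """
    queue = deque()
    dropped = 0
    for _ in range(ticks):
        for _ in range(arrivals_per_tick):
            if len(queue) >= queue_size:
                dropped += 1  # queue full: the operation is discarded
            else:
                queue.append(1)
        for _ in range(min(pool_size, len(queue))):
            queue.popleft()
    return dropped
```

In this model, when operations arrive faster than the pool can drain them, a larger queue_size only delays the first drop. Raising pool_size to match the arrival rate is what removes drops.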

# [stomp] parameters
batch_size - The number of request updates OpsCenter will push out at once. The default value is 100. This is used to avoid overloading the browser.

push_interval - How often OpsCenter will push out updates to requests. The default value is 3 seconds. This is used to avoid overloading the browser.
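
Taken together, the two settings bound how many updates can be pushed per second. The arithmetic below is a rough sketch that assumes one batch of at most batch_size updates is sent every push_interval seconds; the function name is hypothetical.

```python
def max_updates_per_second(batch_size, push_interval_seconds):
    """Upper bound on pushed updates per second under the stated assumption."""
    return batch_size / push_interval_seconds

default_rate = max_updates_per_second(100, 3)   # defaults quoted above
tuned_rate = max_updates_per_second(10000, 10)  # values used in this post
```

Under that assumption the defaults allow about 33 updates per second, while the values used here allow 1000.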


done.

Official OpsCenter configuration documentation