
A cloud platform operations incident: Tai'erzhuang District, Zaozhuang, Shandong

2018-01-12 08:14
Symptoms:

Pages accessed from the public network did not render correctly and showed: Template Error!

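After receiving the report, I logged in to the server remotely and ran service nginx restart. It failed with a disk-full error and could not continue.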

I checked disk usage: the root filesystem (/) was at 100%, so the disk was completely full.
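The disk check above can be sketched as a small filter over `df -P` output. This is only an illustration: the function name `full_mounts` and the 90% default threshold are assumptions, not something from the original incident, and mount points containing spaces are not handled.

```shell
# Print "mountpoint usage" for every filesystem at or above a threshold.
# Reads `df -P` style output on stdin so it can be tried on canned data.
# full_mounts and the default of 90 are hypothetical choices.
full_mounts() {
  local threshold="${1:-90}"
  awk -v t="$threshold" '
    NR > 1 {                      # skip the header line
      use = $5                    # Capacity column, e.g. "100%"
      sub(/%/, "", use)
      if (use + 0 >= t + 0) print $6, $5
    }'
}
```

Typical use would be `df -P | full_mounts 90`, which prints only the filesystems that need attention.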

Run:

find / -size +100M -exec ls -lh {} \;


This lists files larger than 100 MB. The largest ones turned out to be the NGINX logs:



A single day of NGINX logs had grown to more than 30 GB, which is absurd!
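A slightly narrower variant of the search above, as a sketch: `-xdev` keeps find on one filesystem (so /proc and other mounts are skipped), `-type f` restricts it to regular files, and `ls -S` sorts each batch by size. The helper name `big_files` and its defaults are assumptions for illustration.

```shell
# List regular files above a size limit under a directory, largest first
# within each ls batch. big_files is a hypothetical helper name.
big_files() {
  local dir="${1:-/}" min="${2:-100M}"
  find "$dir" -xdev -type f -size "+$min" -exec ls -lhS {} + 2>/dev/null
}
```

For the case described here, `big_files / 100M` would be the equivalent call.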

Copy the last N lines of the file into a new file:

tail -n 1000  access.log >> 111.log


I did this mainly because vi could not open a file that large.

Look at the most recent log entries:

{"ip":"10.24.0.6","request_method":"GET","request_uri":"/dsideal_yy/golbal/getValueByKey","args_get":"key=common.rongyun.suffix","args_post":"-","browser":"Mozilla/5.0 (Windows NT 5.1; rv:5.0) Gecko/20100101 Firefox/5.0","cookie":"-","request_time":"12/Jan/2018:07:52:15 +0800","http_status":"404"}
(The excerpt repeats this identical entry 14 times, all with the same timestamp.)
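With a sample this repetitive, a quick tally by field shows who is sending the traffic and what status they get. The sketch below assumes one JSON object per line, as in the log format above, and uses only grep, sed, sort and uniq; `top_field` is a hypothetical helper name, and it relies on field values containing no escaped quotes.

```shell
# Count occurrences of one JSON string field across log lines on stdin,
# most frequent first. Output lines look like "COUNT VALUE".
top_field() {
  local field="$1"
  grep -o "\"$field\":\"[^\"]*\"" \
    | sed -e "s/^\"$field\":\"//" -e 's/"$//' \
    | sort | uniq -c | sort -rn | sed 's/^ *//'
}
```

For example, `top_field ip < 111.log` and `top_field http_status < 111.log` would summarize the sampled file by client address and by status code.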

Compare with another server that is working normally: http://10.10.14.199/dsideal_yy/golbal/getValueByKey?key=common.rongyun.suffix returns {"common.rongyun.suffix":"_199"}

Now check the same URL on this server: http://10.24.0.7:7777/dsideal_yy/golbal/getValueByKey?key=common.rongyun.suffix returns {"common.rongyun.suffix":"_zztez"}

That is not a 404 either!

Then I tried http://10.24.0.7 without a port, and it was reachable! It even showed "Welcome to nginx!", although our service is supposed to be on port 7777!

Check what is listening on port 80:

yum install lsof
lsof -i tcp:80

[root@bogon conf]# lsof -i tcp:80
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
nginx 243175 root 10u IPv4 2358009951 0t0 TCP *:http (LISTEN)
nginx 243176 root 10u IPv4 2358009951 0t0 TCP *:http (LISTEN)
nginx 243178 root 10u IPv4 2358009951 0t0 TCP *:http (LISTEN)
nginx 243179 root 10u IPv4 2358009951 0t0 TCP *:http (LISTEN)
nginx 243180 root 10u IPv4 2358009951 0t0 TCP *:http (LISTEN)

Use the PID to find the binary and its configuration file:
ps 243175

[root@bogon conf]# ps 243175
PID TTY STAT TIME COMMAND
243175 ? Ss 0:00 nginx: master process /usr/local/openresty/nginx/sbin/nginx -c /usr/local/openresty/nginx/conf/nginx.conf

At this point it is obvious: the problem is in this configuration file, which defines two listening ports, 7777 and 80:

server {
    listen       80;
    server_name  office.edusoa.com;

    location ^~ /yjj/ {
        proxy_pass http://221.194.113.150/;
    }

    location ^~ /pingli/ {
        proxy_pass http://61.134.47.35:9999/;
    }
}
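To see at a glance every port a configuration file opens, the listen directives can be pulled out with awk. This is a rough sketch: `listen_ports` is a hypothetical helper, it only looks at the file it is given (files pulled in through include are not followed), and it ignores anything after a # comment marker.

```shell
# Print the unique address/port arguments of all "listen" directives
# found in an nginx configuration read from stdin.
listen_ports() {
  awk '
    { sub(/#.*/, "") }            # drop comments before matching
    $1 == "listen" {
      gsub(/;/, "", $2)           # "80;" -> "80"
      print $2
    }' | sort -u
}
```

Running `listen_ports < /usr/local/openresty/nginx/conf/nginx.conf` should list both 80 and 7777 if, as described here, both are defined in that file.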


This block is clearly leftover junk. I commented it out and manually deleted the oversized log files, and the world went quiet again.
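One caveat when removing a log by hand: while nginx still holds the file open, deleting it does not release the disk space until nginx reopens its logs or is restarted. Truncating the file in place avoids that. A minimal sketch, where `empty_log` is a hypothetical helper and the log path is whatever the find step reported:

```shell
# Truncate a log file to zero bytes without removing it, so a process
# that still has it open keeps writing to the same (now empty) file.
empty_log() {
  local f="$1"
  [ -f "$f" ] || { echo "not a regular file: $f" >&2; return 1; }
  : > "$f"
}
```

If the file has already been deleted, asking nginx to reopen its log files (`nginx -s reopen`, using the same binary that is running) should make it release the old one.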

Looking back, there are two questions here:

1. Since port 7777 was configured, why was port 80 left in place? That is plainly a mistake.

2. Even with both 80 and 7777 configured, where did the flood of 404 errors come from?

{"ip":"10.24.0.6","request_method":"GET","request_uri":"/dsideal_yy/golbal/getValueByKey","args_get":"key=common.rongyun.suffix","args_post":"-","browser":"Mozilla/5.0 (Windows NT 5.1; rv:5.0) Gecko/20100101 Firefox/5.0","cookie":"-","request_time":"12/Jan/2018:07:52:15 +0800","http_status":"404"}

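Clearly the requests come from 10.24.0.6. That is a Windows host, and the requests are sent by the processing program running on it. What is it doing, and why is it not using port 7777?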

Follow-up:

1. When a request failed, the program simply retried in a while true loop. It should stop, or sleep for a while before trying again.

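2. The processing program did not include the port when fetching the global variable, so it kept hitting port 80. Several mistakes piled up to cause this incident. A painful lesson!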
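Both fixes can be illustrated together. The source does not say what language the processing program is written in, so the sketch below is in shell to match the commands used earlier; `retry_with_backoff` and its arguments are hypothetical. It retries a command a bounded number of times and doubles the pause after each failure instead of spinning in while true.

```shell
# Run a command until it succeeds, at most max_attempts times,
# sleeping between attempts and doubling the delay each time.
# Usage: retry_with_backoff MAX_ATTEMPTS INITIAL_DELAY_SECONDS CMD [ARGS...]
retry_with_backoff() {
  local max_attempts="$1" delay="$2" attempt=1
  shift 2
  while :; do
    "$@" && return 0
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts" >&2
      return 1
    fi
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}
```

A call with the port spelled out explicitly might look like `retry_with_backoff 5 1 curl -fsS "http://10.24.0.7:7777/dsideal_yy/golbal/getValueByKey?key=common.rongyun.suffix"`. With a bounded retry count, a misconfigured URL produces a handful of failed requests rather than tens of gigabytes of 404 log lines.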