Presto logs flooded with "Triggering GC to avoid Code Cache eviction bugs"
2017-08-02 21:14
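Problem description:
The Presto log contains a large number of lines like this one:
2017-07-31T15:31:21.505+0800 INFO Code-Cache-GC-Trigger com.facebook.presto.server.CodeCacheGcTrigger Triggering GC to avoid Code Cache eviction bugs
The Presto version is 0.170.
Investigation:
1. Inspect the Presto source code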
The log line is emitted by the following code:

    // Hack to work around bugs in java 8 (8u45+) related to code cache management.
    // See http://openjdk.5641.n7.nabble.com/JIT-stops-compiling-after-a-while-java-8u45-td259603.html for more info.
    MemoryPoolMXBean codeCacheMbean = findCodeCacheMBean();

    Thread gcThread = new Thread(() -> {
        while (!Thread.currentThread().isInterrupted()) {
            long used = codeCacheMbean.getUsage().getUsed();
            long max = codeCacheMbean.getUsage().getMax();

            if (used > 0.95 * max) {
                log.error("Code Cache is more than 95% full. JIT may stop working.");
            }
            if (used > (max * collectionThreshold) / 100) {
                // Due to some obscure bug in hotspot (java 8), once the code cache fills up the JIT stops compiling
                // By forcing a GC, we let the code cache evictor make room before the cache fills up.
                log.info("Triggering GC to avoid Code Cache eviction bugs");
                System.gc();
            }

            try {
                TimeUnit.MILLISECONDS.sleep(interval.toMillis());
            }
            catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    });
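As the code shows, Presto starts a background thread that checks code cache usage at a fixed interval (20 s by default). When usage exceeds the configured threshold, the thread writes this log line and calls System.gc() explicitly.
The comments make the purpose of the class clear: it works around a bug in code cache management in Java 8 (8u45+), where the JIT stops compiling once the code cache fills up. Forcing a GC frees space before the code cache becomes full.
System.gc() asks the JVM to run a full GC. However, jstat showed that in practice minor GCs were very frequent while the major GC count stayed at 0.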
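The two conditions in the background thread can be tried out in isolation. The following is a minimal sketch, not Presto's code: the class and method names are made up for illustration, the 240 MB maximum and 120 MB usage are the figures quoted in this post, and the 40 percent threshold is an assumed example value.

```java
// Illustrative sketch of the two checks in the background thread.
// CodeCacheThresholds and its method names are hypothetical, not Presto classes.
final class CodeCacheThresholds {
    private CodeCacheThresholds() {}

    /** True when usage exceeds 95% of the maximum (the condition for the ERROR log line). */
    static boolean isNearlyFull(long usedBytes, long maxBytes) {
        return usedBytes > 0.95 * maxBytes;
    }

    /** True when usage exceeds thresholdPercent of the maximum (the condition for System.gc()). */
    static boolean shouldTriggerGc(long usedBytes, long maxBytes, int thresholdPercent) {
        return usedBytes > (maxBytes * thresholdPercent) / 100;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;

        // Assumed example: 120 MB used, 240 MB maximum, 40 percent threshold (trigger point 96 MB).
        System.out.println(shouldTriggerGc(120 * mb, 240 * mb, 40)); // true

        // Same usage with a 300 MB maximum and a 60 percent threshold (trigger point 180 MB).
        System.out.println(shouldTriggerGc(120 * mb, 300 * mb, 60)); // false

        // 235 MB used out of 240 MB is above the 95% mark.
        System.out.println(isNearlyFull(235 * mb, 240 * mb)); // true
    }
}
```

Because the check runs on every interval, usage that sits above the trigger point produces one log line and one System.gc() call per check, which matches the flood of messages described above.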
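2. Review references
(1) https://groups.google.com/forum/#!topic/presto-users/inF0oLvOfqo
The author of that thread eventually set the code cache size to 600M and code-cache-collection-threshold to 60, which improved the situation. Their code cache usage normally stays between 100M and 230M, so it does not reach the configured trigger point of 600 * 0.6 = 360M.
(2) https://news.ycombinator.com/item?id=12505517
That comment states: "Presto is a SQL query engine that generates code for each query (a SQL query is effectively a program), so it can need a lot of codecache depending on the query rate and concurrency."
In other words, Presto generates a large number of classes, so the JVM has to clean up the code cache regularly.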
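The code cache is a non-heap memory area, and the compiled code of generated classes can only be freed once those classes have been unloaded. A minor GC does not unload classes, so only a major GC can clean up the code cache.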
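3. Solution
Check the current configuration
Either of the following two commands can be used to check the configured code cache size. By default the initial size is about 2.4MB and the maximum is 240MB.

    java -XX:+PrintFlagsFinal -version -server | grep ReservedCodeCacheSize
    java -XX:+PrintCodeCache -version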
The amount currently in use, however, has to be read through JMX. Fortunately, Presto itself supports JMX queries.
Open the Presto CLI and run:

    use jmx.current;
    select * from "java.lang:type=memorypool,name=code cache";
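The same memory pool data can also be read from inside a JVM through the standard java.lang.management API. The sketch below is an illustration rather than Presto's findCodeCacheMBean(), whose body is not shown in this post. It assumes the pool is named "Code Cache" (as on Java 8) or is split into pools whose names start with "CodeHeap" (segmented code cache on newer JVMs); the class and method names are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Illustrative sketch: read code cache usage of the current JVM.
// CodeCacheUsage is a hypothetical helper, not part of Presto.
final class CodeCacheUsage {
    private CodeCacheUsage() {}

    /** Assumption: "Code Cache" on Java 8, "CodeHeap ..." pools with a segmented code cache. */
    static boolean isCodeCachePool(String poolName) {
        return poolName.equals("Code Cache") || poolName.startsWith("CodeHeap");
    }

    /** Sum of used bytes over all code cache pools; 0 if none are found. */
    static long usedBytes() {
        long used = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (isCodeCachePool(pool.getName()) && usage != null) {
                used += usage.getUsed();
            }
        }
        return used;
    }

    /** Sum of maximum bytes over all code cache pools; pools with an undefined maximum (-1) are skipped. */
    static long maxBytes() {
        long max = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (isCodeCachePool(pool.getName()) && usage != null && usage.getMax() >= 0) {
                max += usage.getMax();
            }
        }
        return max;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        System.out.println("code cache used (MB): " + usedBytes() / mb);
        System.out.println("code cache max  (MB): " + maxBytes() / mb);
    }
}
```

This only reports the JVM it runs in, so for a running Presto server the jmx connector query remains the practical way to get the number.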
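Modify the configuration
Usage is normally around 120MB, so I set ReservedCodeCacheSize to 300M and code-cache-collection-threshold to 60. The GC trigger point is then 300 * 0.6 = 180MB, which leaves enough headroom.
(1) Add the following to config.properties:

    code-cache-collection-threshold=60

(2) Add the following to jvm.config:

    -XX:ReservedCodeCacheSize=300M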
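The sizing arithmetic can be written down as a small helper for trying other combinations. This is a sketch under the assumption that the trigger point is simply the reserved size multiplied by the threshold percentage, as in the source code quoted earlier; the class and method names are illustrative.

```java
// Illustrative sizing helper; CodeCacheSizing is a hypothetical name.
final class CodeCacheSizing {
    private CodeCacheSizing() {}

    /** Usage level (in MB) above which the GC trigger fires. */
    static long triggerPointMb(long reservedMb, int thresholdPercent) {
        return reservedMb * thresholdPercent / 100;
    }

    /** True when typical usage stays below the trigger point. */
    static boolean hasHeadroom(long typicalUsageMb, long reservedMb, int thresholdPercent) {
        return typicalUsageMb < triggerPointMb(reservedMb, thresholdPercent);
    }

    public static void main(String[] args) {
        System.out.println(triggerPointMb(300, 60));   // 180
        System.out.println(triggerPointMb(600, 60));   // 360
        System.out.println(hasHeadroom(120, 300, 60)); // true
    }
}
```

With about 120MB of typical usage, the 300M / 60 combination keeps usage below the 180MB trigger point, so the periodic check should no longer call System.gc() under normal load.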
References
http://openjdk.5641.n7.nabble.com/JIT-stops-compiling-after-a-while-java-8u45-td259603.html
https://groups.google.com/forum/#!topic/presto-users/inF0oLvOfqo
https://news.ycombinator.com/item?id=12505517