Low-level Thinking in High-level Shading Languages
2013-09-06 09:38
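I needed to disassemble some shader code, so I googled the _sat math function and came across a number of issues around optimization in high-level shading languages. The material is Low-level Thinking in High-level Shading Languages, discussed at http://www.luluathena.com/?p=1633. The gist: when you write shaders in assembly, you have to optimize the code yourself. I have always written in a high-level language such as HLSL and let the compiler optimize, only to find that the way the source is written strongly affects the assembly that gets generated. Relying on compiler optimization is never fully dependable, so programmers who understand a bit of optimization are still in demand.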
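Example 1:
(x + 1.0f) * 0.5f compiles to two assembly instructions, an add followed by a mul, which is intuitive. x*0.5f + 0.5f compiles to a single instruction, mad. This is a rewrite the compiler on PC will not do for us. The reason is that floating-point arithmetic has rounding error, so a compiler that changes the order of operations while optimizing can introduce INF and NaN (and not only the compiler: we can cause the same error ourselves).
Consider x = 0.2f:
sqrt(0.1f * (0.2f - x)) returns 0
sqrt(0.02f - 0.1f * x) returns NaN // 0.02f - 0.1f*0.2f evaluates to a tiny negative value, so the square root fails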
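Both points of Example 1 can be checked on the CPU. The following is a minimal sketch in C, assuming `float` is IEEE-754 single precision; the function names are hypothetical and are not part of the original article. It compares the add/mul and mad forms of the remap, and looks at the sign of the value handed to sqrt in the factored and expanded forms. Only the radicand is computed, because a negative radicand is exactly what makes sqrt return NaN.

```c
#include <assert.h>

/* Two-instruction form from the text: add, then mul. */
float remap_add_mul(float x) { return (x + 1.0f) * 0.5f; }

/* Single multiply-add form, the shape a GPU mad instruction evaluates. */
float remap_mad(float x) { return x * 0.5f + 0.5f; }

/* Radicand of sqrt(0.1f * (0.2f - x)). */
float radicand_factored(float x) { return 0.1f * (0.2f - x); }

/* Radicand of sqrt(0.02f - 0.1f * x): algebraically the same value,
   but the constants 0.02f and 0.1f * 0.2f are rounded differently. */
float radicand_expanded(float x) { return 0.02f - 0.1f * x; }

/* Hypothetical self-check collecting the claims made in Example 1. */
void check_example_1(void)
{
    /* The two remap forms agree on these exactly representable inputs. */
    assert(remap_add_mul(-1.0f) == remap_mad(-1.0f));
    assert(remap_add_mul(0.0f) == remap_mad(0.0f));
    assert(remap_add_mul(1.0f) == remap_mad(1.0f));

    /* Factored form: 0.2f - 0.2f is exactly zero, so sqrt would return 0. */
    assert(radicand_factored(0.2f) == 0.0f);

    /* Expanded form: the radicand is slightly below zero, so sqrt would return NaN. */
    assert(radicand_expanded(0.2f) < 0.0f);
}
```

Calling check_example_1() from a test harness should pass silently under these assumptions. The remap rewrite is harmless here, while the algebraically identical rewrite of the sqrt argument is not, which is a plausible reason for the compiler to leave such reordering to the programmer.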
Example 2:
float AlphaThreshold(float alpha, float threshold, float blendRange)
{
    float halfBlendRange = 0.5f*blendRange;
    threshold = threshold*(1.0f + blendRange) - halfBlendRange;
    float opacity = saturate( (alpha - threshold + halfBlendRange)/blendRange );
    return opacity;
}
The corresponding assembly:
mul r0.x, cb0[0].y, l(0.500000)
add r0.y, cb0[0].y, l(1.000000)
mad r0.x, cb0[0].x, r0.y, -r0.x
add r0.x, -r0.x, v0.x
mad r0.x, cb0[0].y, l(0.500000), r0.x
div_sat o0.x, r0.x, cb0[0].y
The simplified version of this code:
// scale = 1.0f / blendRange
// offset = 1.0f - (threshold/blendRange + threshold)
float AlphaThreshold(float alpha, float scale, float offset)
{
    return saturate( alpha * scale + offset );
}
The corresponding assembly is a single instruction:
mad_sat o0.x, v0.x, cb0[0].x, cb0[0].y
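The simplified shader only works if scale and offset are computed somewhere else, presumably on the CPU before they are uploaded as constants. The following is a minimal sketch of that step in C, using the two formulas from the comments above. The struct and function names are hypothetical, and the positive blendRange precondition is an assumption added for the sketch. It also compares the precomputed form against a direct transcription of the original function.

```c
#include <assert.h>

/* Clamp to [0, 1], mirroring HLSL saturate(). */
float saturate(float v)
{
    return v < 0.0f ? 0.0f : (v > 1.0f ? 1.0f : v);
}

/* Original formulation, transcribed from the shader in Example 2. */
float alpha_threshold_reference(float alpha, float threshold, float blend_range)
{
    float half_blend_range = 0.5f * blend_range;
    threshold = threshold * (1.0f + blend_range) - half_blend_range;
    return saturate((alpha - threshold + half_blend_range) / blend_range);
}

/* Hypothetical container for the two shader constants. */
typedef struct {
    float scale;
    float offset;
} AlphaThresholdConstants;

/* Precompute the constants once, e.g. per material, instead of per pixel. */
AlphaThresholdConstants make_alpha_threshold_constants(float threshold, float blend_range)
{
    AlphaThresholdConstants c;
    assert(blend_range > 0.0f); /* assumed precondition: avoids division by zero */
    c.scale = 1.0f / blend_range;
    c.offset = 1.0f - (threshold / blend_range + threshold);
    return c;
}

/* Simplified formulation: one multiply-add followed by a saturate. */
float alpha_threshold_fast(float alpha, AlphaThresholdConstants c)
{
    return saturate(alpha * c.scale + c.offset);
}

/* Largest absolute difference between the two forms for alpha in [0, 1]. */
float max_abs_difference(float threshold, float blend_range)
{
    AlphaThresholdConstants c = make_alpha_threshold_constants(threshold, blend_range);
    float worst = 0.0f;
    for (int i = 0; i <= 100; ++i) {
        float alpha = (float)i / 100.0f;
        float d = alpha_threshold_reference(alpha, threshold, blend_range)
                - alpha_threshold_fast(alpha, c);
        if (d < 0.0f) d = -d;
        if (d > worst) worst = d;
    }
    return worst;
}
```

With threshold = 0.5f and blendRange = 0.25f, for example, max_abs_difference should stay within rounding error, since the two forms differ only in how the intermediate values are rounded. The saving comes from moving the division and the threshold arithmetic out of the per-pixel path.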
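Programmers focused on building products should not have to think much about low-level optimization. Engine programmers, whether they work on the GPU or the CPU, are expected to practice Low-level Thinking in High-level Languages.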