快速求平方根的算法
2007-07-31 13:18
204 查看
在Vector/Vertex/Matrix类的设计中常用到单位化一个向量或矩阵。需要用到开平方函数sqrt().但math.h中的sqrt()函数最非最佳的选择,事实上,存在很多求平方根较好的算法。
The article was originally created by Tomer Margolin and was originally published at www.CodeMaestro.com"
Any 3D engine draws it's power and speed from the mathematical models and implementations within, and trust John Carmack of ID software for using really good hacks. As it turns out, a very interesting hack is used in Quake III to calculate an inverse square root.
Preface
ID software has recently released the source code of Quake III engine with a GPL license. In this article we'll see Carmack work his black magic to calculate the square root of a floating point number blazingly fast.
Carmack's Unusual Inverse Square Root
A fast glance at the file game/code/q_math.c reveals many interesting performance hacks. The first thing that pops out is the use of the number 0x5f3759df in the function Q_rsqrt, which calculates the inverse square root of a float and why does this function actually work?
Observe the original function from q_math.c:
float Q_rsqrt( float number )
{
long i;
float x2, y;
const float threehalfs = 1.5F;
x2 = number * 0.5F;
y = number;
i = * ( long * ) &y; // evil floating point bit level hacking
i = 0x5f3759df - ( i >> 1 ); // what the fuck?
y = * ( float * ) &i;
y = y * ( threehalfs - ( x2 * y * y ) ); // 1st iteration
// y = y * ( threehalfs - ( x2 * y * y ) ); // 2nd iteration, this can be removed
#ifndef Q3_VM
#ifdef __linux__
assert( !isnan(y) ); // bk010122 - FPE?
#endif
#endif
return y;}
Not only does it work, on some CPU's Carmack's Q_rsqrt runs up to 4 times faster than (float)(1.0/sqrt(x), eventhough sqrt() is usually implemented using the FSQRT assembley instruction!
In another file, code/common/cm_trace.c , a neater implementation of the same hack can be found. This time it is used for calculating the square root of a float - sqrt(x). Note that the only real difference is in the return value – instead of returning y, return number*y as the square root:
/*
================
SquareRootFloat
================
*/
float SquareRootFloat(float number) {
long i;
float x, y;
const float f = 1.5F;
x = number * 0.5F;
y = number;
i = * ( long * ) &y;
i = 0x5f3759df - ( i >> 1 );
y = * ( float * ) &i;
y = y * ( f - ( x * y * y ) );
y = y * ( f - ( x * y * y ) );
return number * y;
}
Newton's Approximation of Roots
The code above implements the well known Newton Approximation of roots [3]. As in most other iterative approximation calculations, The Newton approximation is supposed to be ran in iterations; each iteration enhances the accuracy until enough iterations have been made for reaching the desired accuracy.
The general idea behind
Newton's approximation is that whenever we have a guess y for the value of the square root of a number x, we can perform a simple manipulation to get a better guess (one closer to the actual square root) by averaging y with x/y. For example we can compute the square root of 2 as follows; Suppose the initial guess is 1:
2/1 = 2 ; (2 + 1) / 2 = 1.5
2/1.5 = 1.3333; ( 1.5 + 1.3333 ) / 2 = 1.4167
2/1.4167 = 1.4117; ( 1.4167 + 1.4117 ) / 2 = 1.4142
And so on...
As mentioned before,
Newton's approximation is a well known method of fast root calculations. However, Carmack picked an unusual first guess for the first iteration that enhances the approximation drastically and lowers the number of required iterations in Quake III's calculations to only one iteration!
A Witchcraft Number
The really interesting aspect of this function is the magic constant 0x5f3759df, used to calculate the initial guess, in :
i = 0x5f3759df - ( i >> 1 );
Hence, dividing the input by 2 and subtracting it from the magic constant. This constant works almost perfectly - Only one
Newton approximation iteration is required for a low relative error of 10^-3. . As the code illustrates in the commented out second iteration, this approximation is good enough for the Quake III engine.
It turns out that the magic constant 0x5f3759df is a real mystery. In the article "Fast Inverse Square Root" [2] , mathematician Chris Lomont of
Purdue
University researches the meaning of this magic constant. Using several elaborate techniques, Lomont tries to mathematically calculate this constant himself. The results are very surprising – Lomont's well calculated mathematically optimal constant turns out to be slighltly different (0x5f37642f) , and in spite of being theoretically better, it yields worse results then the original constant used in the source code!! Indeed, John Carmack must have used genuine black magic to find this number.
Only by numerically searching for another better constant, Lomont found one, which is slightly better that the original. However, in practice the two constants are likely to yield similar results. Lomont proposes this function with the better constant:
float InvSqrt(float x)
{ float xhalf = 0.5f*x;
int i = *(int*)&x; // get bits for floating value
i = 0x5f375a86- (i>>1); // gives initial guess y0
x = *(float*)&i; // convert bits back to float
x = x*(1.5f-xhalf*x*x); //
Newton step, repeating increases accuracy
return x;
}
References
Quake III Source Code
Fast Inverse Square Root , By Chris Lomont
Physics Post - Newton's Approximation of Roots
General history of John Carmack's ID software - "The Wizardry of Id" by David Kushner, editor of IEEE's Spectrum Online magazine
from:http://www.codemaestro.com/reviews/review00000105.html
The article was originally created by Tomer Margolin and was originally published at www.CodeMaestro.com"
Any 3D engine draws it's power and speed from the mathematical models and implementations within, and trust John Carmack of ID software for using really good hacks. As it turns out, a very interesting hack is used in Quake III to calculate an inverse square root.
Preface
ID software has recently released the source code of Quake III engine with a GPL license. In this article we'll see Carmack work his black magic to calculate the square root of a floating point number blazingly fast.
Carmack's Unusual Inverse Square Root
A fast glance at the file game/code/q_math.c reveals many interesting performance hacks. The first thing that pops out is the use of the number 0x5f3759df in the function Q_rsqrt, which calculates the inverse square root of a float and why does this function actually work?
Observe the original function from q_math.c:
float Q_rsqrt( float number )
{
long i;
float x2, y;
const float threehalfs = 1.5F;
x2 = number * 0.5F;
y = number;
i = * ( long * ) &y; // evil floating point bit level hacking
i = 0x5f3759df - ( i >> 1 ); // what the fuck?
y = * ( float * ) &i;
y = y * ( threehalfs - ( x2 * y * y ) ); // 1st iteration
// y = y * ( threehalfs - ( x2 * y * y ) ); // 2nd iteration, this can be removed
#ifndef Q3_VM
#ifdef __linux__
assert( !isnan(y) ); // bk010122 - FPE?
#endif
#endif
return y;}
Not only does it work, on some CPU's Carmack's Q_rsqrt runs up to 4 times faster than (float)(1.0/sqrt(x), eventhough sqrt() is usually implemented using the FSQRT assembley instruction!
In another file, code/common/cm_trace.c , a neater implementation of the same hack can be found. This time it is used for calculating the square root of a float - sqrt(x). Note that the only real difference is in the return value – instead of returning y, return number*y as the square root:
/*
================
SquareRootFloat
================
*/
float SquareRootFloat(float number) {
long i;
float x, y;
const float f = 1.5F;
x = number * 0.5F;
y = number;
i = * ( long * ) &y;
i = 0x5f3759df - ( i >> 1 );
y = * ( float * ) &i;
y = y * ( f - ( x * y * y ) );
y = y * ( f - ( x * y * y ) );
return number * y;
}
Newton's Approximation of Roots
The code above implements the well known Newton Approximation of roots [3]. As in most other iterative approximation calculations, The Newton approximation is supposed to be ran in iterations; each iteration enhances the accuracy until enough iterations have been made for reaching the desired accuracy.
The general idea behind
Newton's approximation is that whenever we have a guess y for the value of the square root of a number x, we can perform a simple manipulation to get a better guess (one closer to the actual square root) by averaging y with x/y. For example we can compute the square root of 2 as follows; Suppose the initial guess is 1:
2/1 = 2 ; (2 + 1) / 2 = 1.5
2/1.5 = 1.3333; ( 1.5 + 1.3333 ) / 2 = 1.4167
2/1.4167 = 1.4117; ( 1.4167 + 1.4117 ) / 2 = 1.4142
And so on...
As mentioned before,
Newton's approximation is a well known method of fast root calculations. However, Carmack picked an unusual first guess for the first iteration that enhances the approximation drastically and lowers the number of required iterations in Quake III's calculations to only one iteration!
A Witchcraft Number
The really interesting aspect of this function is the magic constant 0x5f3759df, used to calculate the initial guess, in :
i = 0x5f3759df - ( i >> 1 );
Hence, dividing the input by 2 and subtracting it from the magic constant. This constant works almost perfectly - Only one
Newton approximation iteration is required for a low relative error of 10^-3. . As the code illustrates in the commented out second iteration, this approximation is good enough for the Quake III engine.
It turns out that the magic constant 0x5f3759df is a real mystery. In the article "Fast Inverse Square Root" [2] , mathematician Chris Lomont of
Purdue
University researches the meaning of this magic constant. Using several elaborate techniques, Lomont tries to mathematically calculate this constant himself. The results are very surprising – Lomont's well calculated mathematically optimal constant turns out to be slighltly different (0x5f37642f) , and in spite of being theoretically better, it yields worse results then the original constant used in the source code!! Indeed, John Carmack must have used genuine black magic to find this number.
Only by numerically searching for another better constant, Lomont found one, which is slightly better that the original. However, in practice the two constants are likely to yield similar results. Lomont proposes this function with the better constant:
float InvSqrt(float x)
{ float xhalf = 0.5f*x;
int i = *(int*)&x; // get bits for floating value
i = 0x5f375a86- (i>>1); // gives initial guess y0
x = *(float*)&i; // convert bits back to float
x = x*(1.5f-xhalf*x*x); //
Newton step, repeating increases accuracy
return x;
}
References
Quake III Source Code
Fast Inverse Square Root , By Chris Lomont
Physics Post - Newton's Approximation of Roots
General history of John Carmack's ID software - "The Wizardry of Id" by David Kushner, editor of IEEE's Spectrum Online magazine
from:http://www.codemaestro.com/reviews/review00000105.html
相关文章推荐
- [置顶] RSA 平方-乘算法 与 快速幂
- 快速开平方的算法
- 整数快速开平方算法
- 整数快速开平方算法
- 平方根的快速算法
- [转贴]《雷神之锤III》里求平方根倒数的函数(快速平方根(倒数)算法)
- 快速求平方根的算法
- 关于卡马克算法和系统函数库方法快速开平方根快慢的研究
- [转贴]《雷神之锤III》里求平方根倒数的函数(快速平方根(倒数)算法)
- 紫书第八章-----高效算法设计(快速排序水题格式要求严)
- 【重走普及路】【分治】【经典算法】快速幂
- 程序员如何快速准备面试中的算法
- 【坐在马桶上看算法】算法3:最常用的排序——快速排序
- iOS开发--图形化排序算法比较:快速排序、插入排序、选择排序、冒泡排序
- 【算法导论】快速排序
- 算法学习--希尔和快速排序
- Horspool 字符串快速查找算法
- 算法和数据结构——交换排序(冒泡和快速)
- FFT(快速傅立叶算法 for java)
- 算法导论4:快速排序 2016.1.4