Fastest Gaussian Blur in CPU
2015-09-16 10:20
495 查看
Beginning
My solution is based onFast image convolutions by Wojciech Jarosz.Presented ideas are very simple and I don't know who is the original author. I am going to describe it a little better and add some mathematics.To get motivated, take a glance at
the results. I have implemented this code into
Photopea under Filter - Blur - Gaussian Blur.
Definition
The convolution of two 2D functionsf
and g
is defined as the volume of product of
f
and "shifted" g.
The second function g
is sometimes called "weight", since it determines, how much of
f
will get into the result
The Gaussian blur of a 2D function can be defined as a convolution of that function with 2D
Gaussian function. Our gaussian function has an integral 1 (volume under surface) and is uniquely defined by one parameter
σ
called standard deviation. We will also call it "radius" in the text below.
In our discrete finite case, we represent our 2D functions as matrices of values. We compute the volume (integral) as a sum. Gaussian function has near to zero values behind some radius, so we will use only the values
−r≤x≤r,−r≤y≤r.
This "useful" part of weight is also called the kernel.The value of convolution at [i, j] is the weighted average, i. e. sum of function values around [i, j] multiplied by weight.
Algorithm 1
For a general discrete convolution off
and weight function w,
we can compute the result b
as:
b[i,j]=∑y=i−ri+r∑x=j−rj+rf[y,x]∗w[y,x]
For gaussian weight, we can compute only weights around [i, j] (area of
4⋅r2).
When our matrix has n
values, the time complexity is
O(n⋅r2).
For large radii, e. g. r=10,
we have to do n∗400
operations, which correspond to 400 loops over the whole matrix and that is ugly.
1// source channel, target channel, width, height, radius 2function gaussBlur_1 (scl, tcl, w, h, r) { 3 var rs = Math.ceil(r * 2.57); // significant radius 4 for(var i=0; i<h; i++) 5 for(var j=0; j<w; j++) { 6 var val = 0, wsum = 0; 7 for(var iy = i-rs; iy<i+rs+1; iy++) 8 for(var ix = j-rs; ix<j+rs+1; ix++) { 9 var x = Math.min(w-1, Math.max(0, ix)); 10 var y = Math.min(h-1, Math.max(0, iy)); 11 var dsq = (ix-j)*(ix-j)+(iy-i)*(iy-i); 12 var wght = Math.exp( -dsq / (2*r*r) ) / (Math.PI*2*r*r); 13 val += scl[y*w+x] * wght; wsum += wght; 14 } 15 tcl[i*w+j] = Math.round(val/wsum); 16 } 17}
Algorithm 2
Let's introduce the box blur. It is the convolution of functionf
and weight w,
but weight is constant and lies within a square (box). The nice feature of box blur is, that when you have some weight function having the same variance, it converges to gaussian blur after several passes.
In this algorithm, we will simulate the gaussian blur with 3 passes of box blur. Let's denote the half of size of square as
br
("box radius"). The constant value of weight is
1/(2⋅br)2
(so the sum over the whole weight is 1). We can define box blur as:
bb[i,j]=∑y=i−bri+br∑x=j−brj+brf[y,x]/(2⋅br)2
We have to convert the standard deviation of gaussian blur
r
into dimensions of boxes for box blur. I am not very good at calculus, but fortunatelly I have found
this website and used their implementation.
1function boxesForGauss(sigma, n) // standard deviation, number of boxes 2{ 3 var wIdeal = Math.sqrt((12*sigma*sigma/n)+1); // Ideal averaging filter width 4 var wl = Math.floor(wIdeal); if(wl%2==0) wl--; 5 var wu = wl+2; 6 7 var mIdeal = (12*sigma*sigma - n*wl*wl - 4*n*wl - 3*n)/(-4*wl - 4); 8 var m = Math.round(mIdeal); 9 // var sigmaActual = Math.sqrt( (m*wl*wl + (n-m)*wu*wu - n)/12 ); 10 11 var sizes = []; for(var i=0; i<n; i++) sizes.push(i<m?wl:wu); 12 return sizes; 13}
Our algorithm has still the same complexity
O(n⋅r2),
but it has two advantages: first, the area is much smaller (br
is almost equal to σ,
while significant radius for gaussian is much larger). The second advantage is, that the weight is constant. Even though we have to do it 3 times, it performs faster.
1function gaussBlur_2 (scl, tcl, w, h, r) { 2 var bxs = boxesForGauss(r, 3); 3 boxBlur_2 (scl, tcl, w, h, (bxs[0]-1)/2); 4 boxBlur_2 (tcl, scl, w, h, (bxs[1]-1)/2); 5 boxBlur_2 (scl, tcl, w, h, (bxs[2]-1)/2); 6} 7function boxBlur_2 (scl, tcl, w, h, r) { 8 for(var i=0; i<h; i++) 9 for(var j=0; j<w; j++) { 10 var val = 0; 11 for(var iy=i-r; iy<i+r+1; iy++) 12 for(var ix=j-r; ix<j+r+1; ix++) { 13 var x = Math.min(w-1, Math.max(0, ix)); 14 var y = Math.min(h-1, Math.max(0, iy)); 15 val += scl[y*w+x]; 16 } 17 tcl[i*w+j] = val/((r+r+1)*(r+r+1)); 18 } 19}
Algorithm 3
We have already simplified gaussian blur into 3 passes of box blur. Let's do a little experiment. Let's define ahorizontal blur and total blur:
bh[i,j]=∑x=j−brj+brf[i,x]/(2⋅br)bt[i,j]=∑y=j−brj+brbh[y,j]/(2⋅br)
Those two functions are "looping" in a line, producing "one-dimensional blur". Let's see, what total blur corresponds to:
bt[i,j]=∑y=i−bri+brbh[y,j]/(2⋅br)=∑y=j−brj+br(∑x=j−brj+brf[y,x]/(2⋅br))/(2⋅br)=∑y=i−bri+br∑x=j−brj+brf[y,x]/(2⋅br)2
We just discovered, that our total blur is box blur! Both total blur and horizontal blur have a complexity
O(n⋅r),
so the whole box blur has O(n⋅r).
1function gaussBlur_3 (scl, tcl, w, h, r) { 2 var bxs = boxesForGauss(r, 3); 3 boxBlur_3 (scl, tcl, w, h, (bxs[0]-1)/2); 4 boxBlur_3 (tcl, scl, w, h, (bxs[1]-1)/2); 5 boxBlur_3 (scl, tcl, w, h, (bxs[2]-1)/2); 6} 7function boxBlur_3 (scl, tcl, w, h, r) { 8 for(var i=0; i<scl.length; i++) tcl[i] = scl[i]; 9 boxBlurH_3(tcl, scl, w, h, r); 10 boxBlurT_3(scl, tcl, w, h, r); 11} 12function boxBlurH_3 (scl, tcl, w, h, r) { 13 for(var i=0; i<h; i++) 14 for(var j=0; j<w; j++) { 15 var val = 0; 16 for(var ix=j-r; ix<j+r+1; ix++) { 17 var x = Math.min(w-1, Math.max(0, ix)); 18 val += scl[i*w+x]; 19 } 20 tcl[i*w+j] = val/(r+r+1); 21 } 22} 23function boxBlurT_3 (scl, tcl, w, h, r) { 24 for(var i=0; i<h; i++) 25 for(var j=0; j<w; j++) { 26 var val = 0; 27 for(var iy=i-r; iy<i+r+1; iy++) { 28 var y = Math.min(h-1, Math.max(0, iy)); 29 val += scl[y*w+j]; 30 } 31 tcl[i*w+j] = val/(r+r+1); 32 } 33}
Algorithm 4
One-dimensional blur can be computed even faster. E. g. we want to compute horizontal blur. We computebh[i,j],bh[i,j+1],bh[i,j+2],....
But the neighboring values bh[i,j]
and bh[i,j+1]
are almost the same. The only difference is in one left-most value and one right-most value. So
bh[i,j+1]=bh[i,j]+bh[i,j+r]−bh[i,j−r].
In our new algorithm, we will compute the one-dimensional blur by creating the
accumulator. First, we put the value of left-most cell into it. Then we will compute next values just by editing the previous value in constant time. This 1D blur has the complexity
O(n)
(independent on r).
But it is performed twice to get box blur, which is performed 3 times to get gaussian blur. So the complexity of this gaussian blur is 6 *
O(n).
1function gaussBlur_4 (scl, tcl, w, h, r) { 2 var bxs = boxesForGauss(r, 3); 3 boxBlur_4 (scl, tcl, w, h, (bxs[0]-1)/2); 4 boxBlur_4 (tcl, scl, w, h, (bxs[1]-1)/2); 5 boxBlur_4 (scl, tcl, w, h, (bxs[2]-1)/2); 6} 7function boxBlur_4 (scl, tcl, w, h, r) { 8 for(var i=0; i<scl.length; i++) tcl[i] = scl[i]; 9 boxBlurH_4(tcl, scl, w, h, r); 10 boxBlurT_4(scl, tcl, w, h, r); 11} 12function boxBlurH_4 (scl, tcl, w, h, r) { 13 var iarr = 1 / (r+r+1); 14 for(var i=0; i<h; i++) { 15 var ti = i*w, li = ti, ri = ti+r; 16 var fv = scl[ti], lv = scl[ti+w-1], val = (r+1)*fv; 17 for(var j=0; j<r; j++) val += scl[ti+j]; 18 for(var j=0 ; j<=r ; j++) { val += scl[ri++] - fv ; tcl[ti++] = Math.round(val*iarr); } 19 for(var j=r+1; j<w-r; j++) { val += scl[ri++] - scl[li++]; tcl[ti++] = Math.round(val*iarr); } 20 for(var j=w-r; j<w ; j++) { val += lv - scl[li++]; tcl[ti++] = Math.round(val*iarr); } 21 } 22} 23function boxBlurT_4 (scl, tcl, w, h, r) { 24 var iarr = 1 / (r+r+1); 25 for(var i=0; i<w; i++) { 26 var ti = i, li = ti, ri = ti+r*w; 27 var fv = scl[ti], lv = scl[ti+w*(h-1)], val = (r+1)*fv; 28 for(var j=0; j<r; j++) val += scl[ti+j*w]; 29 for(var j=0 ; j<=r ; j++) { val += scl[ri] - fv ; tcl[ti] = Math.round(val*iarr); ri+=w; ti+=w; } 30 for(var j=r+1; j<h-r; j++) { val += scl[ri] - scl[li]; tcl[ti] = Math.round(val*iarr); li+=w; ri+=w; ti+=w; } 31 for(var j=h-r; j<h ; j++) { val += lv - scl[li]; tcl[ti] = Math.round(val*iarr); li+=w; ti+=w; } 32 } 33}
Results
I was testing all 4 algorithms on an image below (4 channels, 800x200 pixels). Here are the results:Algorithm | Time, r=5 | Time, r=10 | Time complexity |
---|---|---|---|
Algorithm 1 | 7 077 ms | 27 021 ms | O(n⋅r2) |
Algorithm 1 (pre-computed weight) | 2 452 ms | 8 990 ms | O(n⋅r2) |
Algorithm 2 | 586 ms | 2 437 ms | O(n⋅r2) |
Algorithm 3 | 230 ms | 435 ms | O(n⋅r) |
Algorithm 4 | 32 ms | 34 ms | O(n) |
Original image
Perfect Gaussian Blur (Algorithm 1)
Algorithm 2, 3, 4, average error per pixel: 0.04 %
Blur by Adobe Photoshop, average error per pixel: 0.09%
相关文章推荐
- Parcelable和Serializable的区别:
- 毕业后的五年拉开大家差距的原因在哪里?
- Centos6.5部署ftp文件服务器
- IL指令汇总
- Mysql 使用存储过程添加新字段
- linux安装phalcon
- 各种编程语言的深度学习库整理
- 一款功能综合的CSS滑动门特效
- windows下如何安装Python环境和运行python程序
- TCL第三代黑水晶超薄电视C1上市 缔造业界新高度
- 各种编程语言的深度学习库整理
- 8月.CN域名注册量增至1251.5万 环比净增16.9万
- xshell连接虚拟机在家能用到公司就不能用了?
- 程序开发中常用到的颜色
- linux网络编程中常用的辅助函数
- Jetty in Win7
- String,StringBuffer,StringBuilder黑马精华贴
- SVN分支的合并和同步
- IE7 浏览器下面设置text-indent属性变成margin属性BUG
- 让你彻底弄清offset