
[CS131 Computer Vision] Understanding Convolution in Image Processing, with a Python Implementation

2018-02-28 22:22
Author: Chris_yg

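There is always more to learn. Discussion is welcome, so that we can all improve together.

This post covers the 2D convolution formula, its properties, ways to compute it, and Python implementations.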

1. The 2D convolution formula and its properties

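In image processing, an image is made up of discrete pixels, and convolution is typically used to express a weighted sum over the neighborhood of a pixel. The discrete form of 2D convolution is:

$$g(m,n) = f * h = \sum_{k=-\infty}^{\infty}\sum_{l=-\infty}^{\infty} f(k,l)\, h(m-k,\, n-l)$$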

Convolution satisfies the following properties:

Commutativity: $f * h = h * f$

Associativity: $f * (g * h) = (f * g) * h$

Distributivity: $f * (g + h) = f * g + f * h$
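The three properties can be checked numerically. The snippet below is only a sketch: `conv2d_full` is a hypothetical helper written for this check (it is not part of the original post), and it evaluates the defining sum directly on small random arrays.

```python
import numpy as np

def conv2d_full(f, h):
    """Full 2D convolution by direct summation (hypothetical helper)."""
    M1, N1 = f.shape
    M2, N2 = h.shape
    g = np.zeros((M1 + M2 - 1, N1 + N2 - 1))
    # Each sample f[k, l] adds a copy of h, scaled by f[k, l] and shifted to (k, l).
    for k in range(M1):
        for l in range(N1):
            g[k:k + M2, l:l + N2] += f[k, l] * h
    return g

rng = np.random.default_rng(0)
f = rng.standard_normal((4, 5))
g = rng.standard_normal((3, 3))
h = rng.standard_normal((3, 3))

commutative = np.allclose(conv2d_full(f, h), conv2d_full(h, f))
associative = np.allclose(conv2d_full(f, conv2d_full(g, h)),
                          conv2d_full(conv2d_full(f, g), h))
distributive = np.allclose(conv2d_full(f, g + h),
                           conv2d_full(f, g) + conv2d_full(f, h))
print(commutative, associative, distributive)  # True True True
```

The comparisons use `np.allclose` rather than exact equality because the two sides add the same terms in a different order, so they can differ by floating-point rounding.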

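2. Computing 2D convolution, with Python implementations

(1) Direct evaluation of the definition, which needs four nested loops:

Let f have size $(M_1, N_1)$ and h have size $(M_2, N_2)$. Because convolution is commutative, the sum can run over h:

$$g(m,n) = f * h = h * f = \sum_{k=0}^{M_2-1}\sum_{l=0}^{N_2-1} h(k,l)\, f(m-k,\, n-l)$$

where $0 \le m < M_1+M_2-1$ and $0 \le n < N_1+N_2-1$, and $f$ is taken to be zero outside its bounds.

This formula yields the region marked full in the figure below. In image processing we usually want the same region instead, i.e. the output keeps the size of the input image.

[Figure: the full and same output regions]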



import numpy as np

def conv_nested(image, kernel):
    """A naive implementation of convolution filter.

    This is a naive implementation of convolution using 4 nested for-loops.
    This function computes convolution of an image with a kernel and outputs
    the result that has the same shape as the input image.

    Args:
        image: numpy array of shape (Hi, Wi)
        kernel: numpy array of shape (Hk, Wk)

    Returns:
        out: numpy array of shape (Hi, Wi)
    """
    Hi, Wi = image.shape
    Hk, Wk = kernel.shape
    out = np.zeros((Hi, Wi))

    temp_m = np.zeros((Hi+Hk-1, Wi+Wk-1))     # holds the "full" result
    for i in range(Hi+Hk-1):
        for j in range(Wi+Wk-1):
            temp = 0
            # The kernel is usually much smaller than the image, and convolution
            # is commutative, so we loop over the kernel (h*f) instead of f*h.
            for m in range(Hk):
                for n in range(Wk):
                    # Terms that fall outside the image count as zero.
                    if ((i-m)>=0 and (i-m)<Hi and (j-n)>=0 and (j-n)<Wi):
                        temp += image[i-m][j-n] * kernel[m][n]

            temp_m[i][j] = temp
    # Crop the "same" region (output size equals input size)
    for i in range(Hi):
        for j in range(Wi):
            out[i][j] = temp_m[i+(Hk-1)//2][j+(Wk-1)//2]

    return out
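The relation between the full and same outputs is easiest to see in one dimension. As a small illustration (the arrays are made up for this example), NumPy's built-in `np.convolve` supports both modes, and cropping the full result with the offset `(len(kernel) - 1) // 2` reproduces the same result:

```python
import numpy as np

signal = np.array([1, 2, 3, 4])
kernel = np.array([1, 1, 1])

full = np.convolve(signal, kernel, mode="full")   # length 4 + 3 - 1 = 6
same = np.convolve(signal, kernel, mode="same")   # length 4, centered on the signal

# Crop the central part of the full output, as the 2D code does per axis.
offset = (len(kernel) - 1) // 2
cropped = full[offset:offset + len(signal)]

print(full)                           # [1 3 6 9 7 4]
print(same)                           # [3 6 9 7]
print(np.array_equal(cropped, same))  # True
```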


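(2) Rotate the kernel by 180°, zero-pad the original image, then slide the kernel over it and take a weighted sum at each position:

[Figure: sliding the rotated kernel over the zero-padded image]

This is faster than the first method, because the two inner loops are replaced by one element-wise NumPy multiplication and sum per pixel. The 180° rotation can be done with two flips (one along the x axis and one along the y axis). The code is as follows: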

def zero_pad(image, pad_height, pad_width):
    """ Zero-pad an image.

    Ex: a 1x1 image [[1]] with pad_height = 1, pad_width = 2 becomes:

        [[0, 0, 0, 0, 0],
         [0, 0, 1, 0, 0],
         [0, 0, 0, 0, 0]]         of shape (3, 5)

    Args:
        image: numpy array of shape (H, W)
        pad_width: width of the zero padding (left and right padding)
        pad_height: height of the zero padding (bottom and top padding)

    Returns:
        out: numpy array of shape (H+2*pad_height, W+2*pad_width)
    """

    H, W = image.shape

    # Allocate the enlarged array of zeros, then copy the image into its center.
    out = np.zeros((H+2*pad_height, W+2*pad_width))
    out[pad_height:pad_height+H, pad_width:pad_width+W] = image

    return out

def conv_fast(image, kernel):
    """ An efficient implementation of convolution filter.

    This function uses element-wise multiplication and np.sum()
    to efficiently compute weighted sum of neighborhood at each
    pixel.

    Hints:
        - Use the zero_pad function you implemented above
        - There should be two nested for-loops
        - You may find np.flip() and np.sum() useful

    Args:
        image: numpy array of shape (Hi, Wi)
        kernel: numpy array of shape (Hk, Wk)

    Returns:
        out: numpy array of shape (Hi, Wi)
    """
    Hi, Wi = image.shape
    Hk, Wk = kernel.shape
    out = np.zeros((Hi, Wi))

    pad_height = Hk // 2
    pad_width = Wk // 2
    image_padding = zero_pad(image, pad_height, pad_width)
    # Rotate the kernel by 180 degrees: flip along axis 0, then along axis 1.
    kernel_flip = np.flip(np.flip(kernel, 0), 1)

    for i in range(Hi):
        for j in range(Wi):
            # Weighted sum of the (Hk, Wk) window whose top-left corner is (i, j).
            out[i][j] = np.sum(np.multiply(kernel_flip, image_padding[i:(i+Hk), j:(j+Wk)]))

    return out
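The two remaining loops can also be removed. The following is a sketch under stated assumptions rather than part of the original post: `conv_same_vectorized` is a hypothetical name, it uses `np.pad` for the zero-padding, and it relies on `sliding_window_view`, which requires NumPy 1.20 or later. The shift kernel at the end shows why the 180° rotation matters: without it the image would move in the opposite direction.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv_same_vectorized(image, kernel):
    """Loop-free 'same' convolution (hypothetical helper, NumPy >= 1.20)."""
    Hi, Wi = image.shape
    Hk, Wk = kernel.shape
    # np.pad fills with zeros by default (mode="constant").
    padded = np.pad(image, ((Hk // 2, Hk // 2), (Wk // 2, Wk // 2)))
    # windows[i, j] is the (Hk, Wk) patch whose top-left corner is padded[i, j].
    windows = sliding_window_view(padded, (Hk, Wk))[:Hi, :Wi]
    # Rotating by 180 degrees is the same as flipping along both axes.
    kernel_rot = np.rot90(kernel, 2)
    # Multiply every window by the rotated kernel and sum over the window axes.
    return np.einsum("ijkl,kl->ij", windows, kernel_rot)

image = np.arange(12.0).reshape(3, 4)

# A kernel with a single 1 to the right of its center shifts the image
# one pixel to the right under convolution.
shift_right = np.zeros((3, 3))
shift_right[1, 2] = 1.0

out = conv_same_vectorized(image, shift_right)
print(np.array_equal(out[:, 1:], image[:, :-1]))  # True
print(np.all(out[:, 0] == 0))                     # True
```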


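(3) Using the Fourier transform

This relies on the convolution theorem:

$$\mathcal{F}(f * h) = \mathcal{F}(f) \cdot \mathcal{F}(h)$$

$$f * h = \mathcal{F}^{-1}\big(\mathcal{F}(f) \cdot \mathcal{F}(h)\big)$$

where $\mathcal{F}$ denotes the Fourier transform and $\mathcal{F}^{-1}$ the inverse Fourier transform.

I have not written code for this yet. Interested readers are welcome to write it themselves.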
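For readers who want a starting point, here is one possible sketch using NumPy's FFT routines. It is not the author's code: `conv_fft` is a hypothetical name, and the approach assumes both inputs are zero-padded to the full output size so that the circular convolution computed by the discrete Fourier transform matches the linear convolution defined earlier.

```python
import numpy as np

def conv_fft(image, kernel):
    """'Same' convolution via the FFT (hypothetical helper)."""
    Hi, Wi = image.shape
    Hk, Wk = kernel.shape
    # Passing s= zero-pads both inputs to the full output size, which avoids
    # the wrap-around that a plain DFT product would otherwise introduce.
    full_shape = (Hi + Hk - 1, Wi + Wk - 1)
    spectrum = np.fft.fft2(image, s=full_shape) * np.fft.fft2(kernel, s=full_shape)
    # The inputs are real, so the imaginary part is only rounding noise.
    full = np.real(np.fft.ifft2(spectrum))
    # Crop the central 'same' region.
    top, left = (Hk - 1) // 2, (Wk - 1) // 2
    return full[top:top + Hi, left:left + Wi]

image = np.arange(12.0).reshape(3, 4)

# Shifting kernel: a single 1 to the right of the center moves the image
# one pixel to the right, with zeros entering from the left edge.
shift_right = np.zeros((3, 3))
shift_right[1, 2] = 1.0

expected = np.zeros_like(image)
expected[:, 1:] = image[:, :-1]

result = conv_fft(image, shift_right)
print(np.allclose(result, expected))  # True
```

For small kernels the sliding-window method is usually simpler. The FFT route tends to pay off when the kernel is large, since its cost depends on the padded image size rather than on the product of image size and kernel size.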