
Handling Image CAPTCHAs with Python

2016-06-20 16:50

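I recently had to deal with image CAPTCHAs: the login window of the application under test shows a numeric verification code. After reading a lot of material and asking more experienced colleagues, I now have a workable approach, which I am sharing here.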

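Scenario: the automated tests run on RF (Robot Framework), so Python is the natural choice for handling the CAPTCHA. The approach has three steps: crop the CAPTCHA image out of a page screenshot and save it locally -> denoise the image -> call pytesseract to convert the CAPTCHA in the image to text.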

1) Crop the CAPTCHA image out of the page and save it locally:

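from PIL import Image
from selenium.webdriver.common.by import By

def cut_pinnum_pic_from_page(xpath, window, pageScreen, pinNumPath):
    '''
    Crop the CAPTCHA element out of a full-page screenshot.

    Example of creating the driver that is passed in as `window`:
        url='http://10.2.122.143:8000/?HIPPO_TRADE_URL=ws://10.2.122.143:1521#StockQuote'
        driver = webdriver.Chrome()
        driver.maximize_window()  # maximize the browser window
        driver.get(url)
        driver.save_screenshot('C:\\aa.png')  # screenshot of the page that contains the CAPTCHA
    '''
    driver = window
    # pageScreen = "C:\\yzm\\page.png"
    # pinNumPath = 'C:\\yzm\\pinnum.jpg'
    driver.save_screenshot(pageScreen)
    imgelement = driver.find_element(By.XPATH, xpath)
    # Position (x, y) and size of the CAPTCHA element
    location = imgelement.location
    size = imgelement.size
    # Crop box in the form (left, upper, right, lower)
    rangle = (int(location['x']), int(location['y']),
              int(location['x'] + size['width']),
              int(location['y'] + size['height']))
    i = Image.open(pageScreen)
    # Use Image.crop to cut out the CAPTCHA
    pin = i.crop(rangle)
    # JPEG cannot store an alpha channel, so drop it if present
    pin.convert("RGB").save(pinNumPath)
    print("::: %s saved successfully!" % pinNumPath)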
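
The crop box above assumes that one page coordinate equals one screenshot pixel. On high-DPI or zoomed displays the screenshot can be larger than the page coordinates, in which case the crop may miss the CAPTCHA. A minimal sketch of computing the box with a scale factor follows; `crop_box` and its `scale` parameter are hypothetical names that are not part of the original code, and whether a factor is needed at all depends on the browser setup.

```python
def crop_box(location, size, scale=1.0):
    """Return a (left, upper, right, lower) box for PIL's Image.crop.

    location and size are dicts shaped like Selenium's element.location
    and element.size. scale is a hypothetical device-pixel-ratio factor;
    1.0 reproduces the calculation used in the article.
    """
    left = int(location['x'] * scale)
    upper = int(location['y'] * scale)
    right = int((location['x'] + size['width']) * scale)
    lower = int((location['y'] + size['height']) * scale)
    return (left, upper, right, lower)


# Example with made-up coordinates
box = crop_box({'x': 100, 'y': 40}, {'width': 60, 'height': 20}, scale=2.0)
print(box)  # (200, 80, 320, 120)
```

With Selenium, the factor could be read with `driver.execute_script('return window.devicePixelRatio')` and the resulting box passed to `Image.crop` as before.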

2) Denoise the image:

from PIL import Image, ImageDraw

def getPixel(image, x, y, G, N):
    '''
    Binarized noise test. If the pixel at (x, y) is judged to be noise,
    return the gray value of the pixel directly above it as a replacement;
    otherwise return None.
    This could also be rewritten to compare RGB values, depending on your needs.
    '''
    L = image.getpixel((x, y)) > G
    # Count how many of the 8 neighbours fall on the same side of threshold G
    nearDots = 0
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            if (dx, dy) != (0, 0) and L == (image.getpixel((x + dx, y + dy)) > G):
                nearDots += 1
    if nearDots < N:
        return image.getpixel((x, y - 1))
    else:
        return None

def clearNoise(image, G, N, Z):
    '''
    Denoise the image in place.
    The binarized value of a pixel A is compared with that of its 8 neighbours.
    Given a value N (0 < N < 8), A is treated as noise when fewer than N of
    the neighbours match it.
    G: Integer, binarization threshold
    N: Integer, denoising rate, 0 < N < 8
    Z: Integer, number of denoising passes
    Returns nothing; the image is modified in place.
    '''
    draw = ImageDraw.Draw(image)
    for i in range(0, Z):
        for x in range(1, image.size[0] - 1):
            for y in range(1, image.size[1] - 1):
                color = getPixel(image, x, y, G, N)
                if color is not None:
                    draw.point((x, y), color)

def op_clearNoise(oldPic, newPic):
    '''
    Run the denoising step on oldPic and save the result to newPic.
    '''
    image = Image.open(oldPic)
    image = image.convert("L")
    clearNoise(image, 50, 4, 4)
    print(":::clearNoise successfully!")
    image.save(newPic)
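
To see the neighbour-counting rule in isolation, here is a small sketch that applies it to a plain nested list instead of a PIL image. The helper name `is_noise` and the sample grid are made up for illustration; the values G=50 and N=4 are the ones passed to `clearNoise` above.

```python
def is_noise(grid, x, y, G, N):
    """Return True when fewer than N of the 8 neighbours of grid[y][x]
    fall on the same side of threshold G as the pixel itself."""
    side = grid[y][x] > G
    same = 0
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            if (dx, dy) == (0, 0):
                continue
            if (grid[y + dy][x + dx] > G) == side:
                same += 1
    return same < N


# 255 = white background, 0 = dark pixel (made-up sample data)
grid = [
    [255, 255, 255, 255, 255],
    [255,   0, 255, 255, 255],   # isolated dark speck at x=1, y=1
    [255, 255, 255,   0,   0],
    [255, 255,   0,   0,   0],   # x=3, y=3 lies inside a solid dark area
    [255, 255,   0,   0,   0],
]

print(is_noise(grid, 1, 1, G=50, N=4))  # True: no neighbour matches the speck
print(is_noise(grid, 3, 3, G=50, N=4))  # False: 7 of 8 neighbours match
```

A pixel on the outer corner of a rectangular stroke has only three matching neighbours, so with N=4 it is also treated as noise. Lowering N preserves more of thin strokes at the cost of leaving more specks behind.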

3) Call pytesseract to convert the CAPTCHA in the image to text:

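from PIL import Image

# util, call_tesseract, scratch_image_name and scratch_text_name_root are not
# defined in this post. They are assumed to be in scope already (the names
# match the module-level helpers of the older pytesser module).

def image_to_string(imPath, cleanup):
    im = Image.open(imPath)
    text = ""  # defined up front so the return below works even if OCR fails
    try:
        util.image_to_scratch(im, scratch_image_name)
        call_tesseract(scratch_image_name, scratch_text_name_root)
        print(":::Begin to transfer string from image- %s" % imPath)
        text = util.retrieve_text(scratch_text_name_root)
    except Exception as e:
        print("::: Exception occurred:------------")
        print(e)
    finally:
        if cleanup is True:
            util.perform_cleanup(scratch_image_name, scratch_text_name_root)
    return text.strip()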
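
The function above relies on helper names that the post does not define. As an alternative, here is a minimal sketch that uses the `pytesseract` package named in the step title. It assumes `pytesseract`, Pillow, and the Tesseract binary are installed. `ocr_digits` and `digits_config` are hypothetical names, and the page segmentation mode and digit whitelist are assumptions about what suits a one-line numeric CAPTCHA, not settings taken from the article.

```python
def digits_config(psm=7):
    """Build a Tesseract config string for a single line of digits.

    --psm 7 tells Tesseract to treat the image as one text line; the
    whitelist restricts recognition to the characters 0-9.
    """
    return "--psm %d -c tessedit_char_whitelist=0123456789" % psm


def ocr_digits(image_path):
    """Run OCR on image_path and return the recognized text, stripped."""
    # Imported here so digits_config works without the OCR stack installed
    import pytesseract
    from PIL import Image

    image = Image.open(image_path)
    return pytesseract.image_to_string(image, config=digits_config()).strip()


print(digits_config())  # --psm 7 -c tessedit_char_whitelist=0123456789
```

Calling `ocr_digits` on the denoised image file would then replace the scratch-file handling shown above. Whether the whitelist takes effect depends on the installed Tesseract version and engine, so the result is still worth checking.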

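After the conversion you can post-process the recognized text yourself. For example, a 5 is sometimes recognized as $; a short script that maps such characters back improves the recognition accuracy.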
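
Such a correction script can be as small as a lookup table. A minimal sketch follows: only the `$` to `5` mapping comes from the text above, while the other entries and the name `fix_digits` are hypothetical and should be adjusted to the misreadings you actually observe.

```python
# Characters that OCR may confuse with digits. Only '$' -> '5' is
# mentioned in the article; the other entries are illustrative guesses.
CORRECTIONS = {
    '$': '5',
    'O': '0',
    'o': '0',
    'l': '1',
    'I': '1',
}


def fix_digits(text):
    """Replace known misreadings, then keep only the digit characters."""
    fixed = ''.join(CORRECTIONS.get(ch, ch) for ch in text)
    return ''.join(ch for ch in fixed if ch.isdigit())


print(fix_digits(' 3$O7\n'))  # 3507
```

Dropping every non-digit at the end suits a purely numeric CAPTCHA; for codes that mix letters and digits that last step would have to go.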
