
Design and Implementation of a Pornographic Image Filter for the Internet Explorer Browser

2010-01-28 12:26
The Internet has become the best way to acquire knowledge and broaden one's horizons, and it is steadily growing into a mass medium and communication tool within everyone's reach. It has also brought problems such as the spread of pornographic fiction and images. A number of anti-pornography software packages have been released, but their algorithms rely either on IP address filtering or on judging the text content of web pages. The former blocks content by matching IP addresses and requires newly collected addresses to be added to an IP address database at regular intervals, so it clearly lags behind new sites. The latter is limited by what the text reveals: some relevant sites slip through while many harmless sites are blocked, so its accuracy is low. Only analysis and understanding of image content can fundamentally address the weakness of current network security technology in filtering and monitoring image information.

This filter applies content-based image filtering (CBIF), using skin color detection supplemented by texture processing. Pornographic image filtering can be viewed as an image understanding and recognition problem, but it differs from ordinary face recognition and fingerprint recognition: image backgrounds are complex, lighting conditions are inconsistent, and the human body appears in many poses, so it is hard to capture all the relevant characteristics with one simple model. Because the defining characteristic of pornographic images is exposed skin, we combine a skin color detection model with a texture model and use a corresponding classification algorithm to build the filter model.
Color information is often an effective feature for image segmentation. Under different lighting conditions the brightness of an object's color can vary greatly, but its chrominance is largely constant. Skin color broadly falls into yellow, brown, black, and white tones, each of which occupies a certain region of color space and forms a characteristic pattern.

YUV and YIQ are two color spaces commonly used in image processing research, and the chrominance-space model uses the skin-related properties of both for skin color detection. In the YUV color model, Y denotes luminance and U and V are the chrominance signals. Together U and V form a two-dimensional vector called the chrominance signal vector. Every color corresponds to one such vector: its saturation is given by the magnitude Ch and its hue by the phase angle θ. Segmenting an image by θ removes background whose hue differs greatly from skin, but this model has difficulty separating skin from brown hair or gray backgrounds. Because facial skin contains a relatively large yellow component, the color information in YIQ space can be used to improve the segmentation. The I component represents the hue axis from orange-yellow to blue-green; under the standard YIQ definition, larger I values mean more orange-yellow and less blue-green. Using the phase angle θ in YUV space and the I component in YIQ space as features, the chrominance range of skin color can be determined. Statistics show that the skin hue range is regular: θ lies essentially within [100, 150] and I within [20, 90].

Applying the skin color model to an image under test gives a preliminary mask image. Because other objects can have colors similar to skin, this mask may contain many false detections. The texture model is therefore applied to the mask from the previous step to remove skin-colored regions that are not skin, so that the skin regions are identified more accurately, the correct detection rate rises, and the false detection rate falls. All visual surfaces have texture, which carries important information about the structure of an object and its relationship to its surroundings.
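The chrominance test described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes RGB channels in 0..255, the standard analog YUV and NTSC YIQ coefficients, and θ measured in degrees as the angle of the vector (U, V). The function names are hypothetical.

```python
import math

# Threshold ranges quoted in the text; the units (degrees for theta,
# 0..255 RGB scale for I) are assumptions made for this sketch.
THETA_RANGE = (100.0, 150.0)
I_RANGE = (20.0, 90.0)


def chroma_features(r, g, b):
    """Return (theta, i) for one RGB pixel with channels in 0..255."""
    # Chrominance signals of the standard analog YUV transform.
    u = -0.147 * r - 0.289 * g + 0.436 * b
    v = 0.615 * r - 0.515 * g - 0.100 * b
    # Phase angle of the chrominance vector (U, V), mapped to 0..360 degrees.
    theta = math.degrees(math.atan2(v, u)) % 360.0
    # I component of the NTSC YIQ transform (positive toward orange).
    i = 0.596 * r - 0.274 * g - 0.322 * b
    return theta, i


def is_skin_pixel(r, g, b):
    """True when both chrominance features fall inside the quoted ranges."""
    theta, i = chroma_features(r, g, b)
    return (THETA_RANGE[0] <= theta <= THETA_RANGE[1]
            and I_RANGE[0] <= i <= I_RANGE[1])


def skin_mask(image):
    """image: rows of (r, g, b) tuples -> rows of 0/1 mask values."""
    return [[1 if is_skin_pixel(*px) else 0 for px in row] for row in image]
```

Both conditions must hold for a pixel to enter the mask, which mirrors the text's point that θ alone cannot separate skin from gray backgrounds: a neutral gray pixel has I close to zero and is rejected by the second test.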
The texture model analyzes the texture characteristics of skin pixels to find what they have in common, removing falsely detected regions from the skin color detection result while keeping the correctly detected ones. Gray-level statistics are a basic tool in image processing, and the implementation uses first-order gray-level statistics to judge whether a point and its surroundings show the smoothness typical of skin. First, a statistical gray value is computed for the region, usually the mean or the gray value of the center pixel. The pixel values in the region are then compared with this statistic to produce features, which determine whether the region is consistent with the statistical characteristics of skin.

To give the classifier a good classification vector, feature values are extracted from the original image on the basis of the mask image. In image indexing and retrieval, the more mature features are color, texture, some simple low-level shape features, and the spatial relationships between objects. These features are simple to compute and behave stably. Color is an important attribute of visual information and a very useful feature for indexing and retrieval. Color features are very stable: they are insensitive to rotation, translation, scale change, and even many kinds of deformation. Being both robust and cheap to compute, they are the most widely used features in existing retrieval systems. The feature mechanisms of image retrieval and filtering can therefore be applied to detecting and filtering pornographic images. Since the defining characteristic of these images is exposed skin, their skin color features can serve as the classification vector that separates them from normal images. A color feature vector of the exposed skin is extracted and computed from the image using the skin mask, and a classifier then predicts whether the given image is pornographic. From an analysis of a large number of pornographic images, we extract seven features as the classification feature vector used to train the classifier and filter images.

The filter is implemented with Browser Helper Object (BHO) technology, which allows processing code to be introduced into the browser's address space.
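For the texture step described earlier, the first-order gray-level test can be sketched as follows. The abstract does not give the window size, the exact statistic, or the threshold, so the choices here (a square window, mean absolute deviation from the regional mean, and the parameters `radius` and `max_dev`) are assumptions for illustration, and the function name is hypothetical.

```python
def refine_mask_by_smoothness(gray, mask, radius=1, max_dev=8.0):
    """Keep only mask pixels whose neighborhood is smooth.

    gray: rows of gray levels (0..255); mask: rows of 0/1, same shape.
    radius and max_dev are illustrative values, not taken from the source.
    """
    height, width = len(gray), len(gray[0])
    refined = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if not mask[y][x]:
                continue  # only skin-colored candidates are examined
            # Collect the window around (x, y), clipped at the image border.
            window = [
                gray[yy][xx]
                for yy in range(max(0, y - radius), min(height, y + radius + 1))
                for xx in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            # Regional statistic: the mean gray value of the window.
            mean = sum(window) / len(window)
            # Compare each pixel with the statistic: mean absolute deviation.
            deviation = sum(abs(p - mean) for p in window) / len(window)
            if deviation <= max_dev:
                refined[y][x] = 1
    return refined
```

A uniform patch passes unchanged, while a strongly textured patch of the same color (for example fabric or foliage that happened to fall in the skin chrominance range) is removed from the mask.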
Internet Explorer and its helper objects work as follows: the program looks for add-on modules in a well-known, predefined location on disk, loads and initializes them, and then carries out the work they were designed for. Browser Helper Objects are written as in-process Component Object Model (COM) components. They run in the same memory context as the browser and can perform any operation on the available windows and modules. A BHO can detect typical browser events such as GoBack, GoForward, and DocumentComplete, and it can install hooks to monitor messages and actions. A BHO is attached to the browser's main window, and a new instance of the object is created each time a browser window is created.

The filter built with BHO technology starts together with the browser. When the browser navigates to a new address, the filter first obtains the entered URL and checks whether it is prohibited by comparing it with the prohibited URLs stored in a file. If the URL is prohibited, the page is closed; otherwise navigation proceeds. After navigation the browser downloads the page resources, and once the download is complete the filter processes the images: it obtains the browser interface pointer, retrieves the image information (such as the number of images), and handles the images one by one. It locates each image file by name and runs the pornographic image detector on it. If an image is pornographic, the filter issues a warning, closes the page, and stores the URL in the corresponding file; otherwise the page is displayed.

Experiments and analysis show that the system recognizes pornographic images well and is intelligent, robust, and efficient. It essentially achieves online monitoring and analysis of images without affecting normal network operation, addressing the difficult and urgent problem of filtering pornographic images in web pages. The system could be improved further by refining the detection program to raise its accuracy and efficiency and by completing the system framework.

Key words: filtering; chrominance space; RGB; YUV; YIQ; browser; Browser Helper Object (BHO); pornographic image
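The navigation workflow described in the abstract can be summarized as decision logic. A real BHO is an in-process COM component, so this Python sketch only models the control flow. The class and method names are hypothetical and do not correspond to any Internet Explorer API, the in-memory set stands in for the file of prohibited URLs, and the detector is passed in as an arbitrary callable.

```python
class PageFilter:
    """Models the filter's two decision points for a single browser window."""

    def __init__(self, blocked_urls, is_pornographic):
        # Stand-in for the file of prohibited URLs mentioned in the text.
        self.blocked_urls = set(blocked_urls)
        # Hypothetical detector: callable(image) -> bool.
        self.is_pornographic = is_pornographic

    def on_navigate(self, url):
        """Decide what to do when the browser is about to open url."""
        return "close" if url in self.blocked_urls else "navigate"

    def on_document_complete(self, url, images):
        """Check every image once the page has finished downloading."""
        for image in images:
            if self.is_pornographic(image):
                # Remember the URL so the next visit is blocked up front.
                self.blocked_urls.add(url)
                return "close"
        return "show"


# Usage with a toy detector that flags any image labelled "bad".
page_filter = PageFilter(["http://blocked.example/"], lambda img: img == "bad")
first_visit = page_filter.on_document_complete("http://new.example/", ["a", "bad"])
second_visit = page_filter.on_navigate("http://new.example/")
```

Recording the URL after a positive detection means the image analysis, which is the expensive step, runs only once per offending page; later visits are rejected by the list lookup alone.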

Source: 聚合吧 (http://www.juhe8.com/). Original article: http://www.juhe8.com/lunwen/qita/2008-01-13/83974.html