Classic Paper Study: Bag of Features (2)
2014-12-17 16:38
Bag-of-Words
The bag-of-words (BoW) model is a common document representation in information retrieval. The BoW model ignores a document's word order, grammar, and syntax, treating the document simply as a collection of words, where each word's occurrence is independent of whether any other word occurs. For example, take the following two documents:
Document 1: Bob likes to play basketball, Jim likes too.
Document 2: Bob also likes to play football games.
From these two text documents, extract the individual words and build a dictionary:
Dictionary = {1. "Bob", 2. "likes", 3. "to", 4. "play", 5. "basketball", 6. "also", 7. "football", 8. "games", 9. "Jim", 10. "too"}.
The dictionary contains 10 distinct words. Using it, count each word's occurrences in the two documents; each document can then be represented as a 10-dimensional vector:
Document 1: [1, 2, 1, 1, 1, 0, 0, 0, 1, 1]
Document 2: [1, 1, 1, 1, 0, 1, 1, 1, 0, 0]
If the word histograms of documents in each category exhibit a characteristic pattern, that pattern can be exploited to classify massive document collections.
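The construction above can be sketched in a few lines of Python (a minimal illustration, not from any paper; note that this vocabulary is built in order of first appearance, so the indices differ from the dictionary above, and no stemming is applied):

```python
# Minimal bag-of-words sketch: naive tokenization (lowercase, strip
# punctuation), vocabulary in order of first appearance, raw term counts.
from collections import Counter

def tokenize(doc):
    return doc.lower().replace(",", "").replace(".", "").split()

def build_vocab(docs):
    vocab = []
    for doc in docs:
        for word in tokenize(doc):
            if word not in vocab:
                vocab.append(word)
    return vocab

def bow_vector(doc, vocab):
    counts = Counter(tokenize(doc))
    return [counts[w] for w in vocab]

docs = ["Bob likes to play basketball, Jim likes too.",
        "Bob also likes to play football games."]
vocab = build_vocab(docs)
vectors = [bow_vector(d, vocab) for d in docs]
print(vectors[0])  # [1, 2, 1, 1, 1, 1, 1, 0, 0, 0]
print(vectors[1])  # [1, 1, 1, 1, 0, 0, 0, 1, 1, 1]
```

The counts match the text's example; only the word order in the vocabulary differs.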
Bag of Features
1.1 [CVPR06] Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories
Abstract: The paper carries the BoW idea over to images, with the "word" replaced by a particular local feature descriptor. This alone discards the spatial layout of the image entirely and is "incapable of capturing shape or of segmenting an object from its background", so the authors combine it with spatial pyramid matching: "Our method involves repeatedly subdividing the image and computing histograms of local features at increasingly fine resolutions."
Comparison: the paper positions itself against prior work in three respects:
- Locally orderless images: SPM can be seen as an alternative formulation of a locally orderless image; instead of a Gaussian scale space of apertures, it defines a fixed hierarchy of rectangular windows.
- Multiresolution histograms: SPM fixes the resolution at which the features are computed, but varies the spatial resolution at which they are aggregated.
- Subdivide and disorder: the best subdivision scheme may be achieved when multiple resolutions are combined in a principled way; the reason for the empirical success of "subdivide and disorder" techniques is that they actually perform approximate geometric matching.
Pyramid Match Kernels:
Let X and Y be two sets of feature vectors. The pyramid match computes an approximate correspondence between X and Y by "placing a sequence of increasingly coarser grids over the feature space and taking a weighted sum of the number of matches that occur at each level of resolution." Two points match when they fall into the same cell; resolution levels run from 0 (coarsest) to L (finest).
At level \ell the space is divided into D = 2^{d\ell} cells (d dimensions, 2^\ell cells per dimension; are these cells the same as the cluster centers discussed later?). The number of matches at level \ell is given by the histogram intersection (1):

I^\ell = I(H_X^\ell, H_Y^\ell) = \sum_{i=1}^{D} \min(H_X^\ell(i), H_Y^\ell(i))    (1)

The weight of level \ell is set to 1/2^{L-\ell}. Note that matches at a lower (coarser) level include all matches at the finer levels, so the number of new matches at level \ell is I^\ell - I^{\ell+1}, and the pyramid match kernel is given by (2):

\kappa^L(X, Y) = I^L + \sum_{\ell=0}^{L-1} \frac{1}{2^{L-\ell}} (I^\ell - I^{\ell+1}) = \frac{1}{2^L} I^0 + \sum_{\ell=1}^{L} \frac{1}{2^{L-\ell+1}} I^\ell    (2)
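The two equations above can be sketched directly (an illustrative toy, assuming features already lie in [0, 1)^d; the paper's actual pipeline quantizes features by clustering instead of a uniform grid):

```python
# Sketch of the pyramid match kernel: bin points at levels 0..L,
# count matches by histogram intersection, weight by 1/2**(L-l+1).
from collections import Counter

def histogram(points, level):
    # 2**level cells per dimension; clamp so x == 1.0 stays in range.
    n = 2 ** level
    return Counter(tuple(min(int(x * n), n - 1) for x in p) for p in points)

def intersection(h1, h2):
    # Eq. (1): number of matches = sum of per-cell minima.
    return sum(min(c, h2[cell]) for cell, c in h1.items())

def pyramid_match(X, Y, L):
    I = [intersection(histogram(X, l), histogram(Y, l)) for l in range(L + 1)]
    # Eq. (2), second form: I^0 weighted 1/2**L, level l weighted 1/2**(L-l+1).
    return I[0] / 2 ** L + sum(I[l] / 2 ** (L - l + 1) for l in range(1, L + 1))

X = [(0.1,), (0.6,)]
Y = [(0.15,), (0.9,)]
print(pyramid_match(X, Y, L=1))  # 2.0 (both points match at both levels)
```

Note the maximal kernel value equals the set size, reached when the two sets coincide at every level.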
Spatial Matching Scheme
The scheme performs "pyramid matching in the two-dimensional image space, and uses traditional clustering techniques in feature space." (In image space the pixel coordinates already encode the geometric information, so the vectors only need to be arranged by coordinate. In feature space, clustering quantizes the features into M channels, which is what "falling into the same cell" means above; H is the per-channel histogram, and a smaller I means the two images are less correlated.) Summing the per-channel pyramid kernels gives the final kernel (3):

K^L(X, Y) = \sum_{m=1}^{M} \kappa^L(X_m, Y_m)    (3)

The dimension is:

M \sum_{\ell=0}^{L} 4^\ell = M \, \frac{4^{L+1} - 1}{3}

(In implementation, K^L(X, Y) is not computed as a sum; the I vectors at every level are concatenated into one long vector.) With M = 400 and L = 3, d = 34000: long and sparse.
Normalize all histograms by the total weight of all features in the image.
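The dimension count can be checked directly (`spm_dim` is a made-up helper name, not from the paper):

```python
# Dimension of the concatenated SPM vector: M channels, each with
# sum_{l=0}^{L} 4**l spatial cells (a 2**l x 2**l grid at level l, d = 2).
def spm_dim(M, L):
    return M * sum(4 ** l for l in range(L + 1))

print(spm_dim(400, 3))  # 34000, matching the paper's "long and sparse" example
```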
histogram intersection function
The histogram intersection function is used to match feature histograms by similarity; its formula is eq. (1):

I(H_X, H_Y) = \sum_{i=1}^{D} \min(H_X(i), H_Y(i))    (1)

[Figure (2), an illustration of the histogram intersection pyramid, is not preserved here.] In figure (2), y and z in panel (a) are two data distributions, and the three sub-figures correspond to three pyramid levels; the evenly spaced dashed lines in each mark the histogram bin width. The larger the pyramid level L, the narrower the bins and the more of them there are. The red and blue points stay fixed, but depending on the bin width they fall into different bins, as shown in (b). Panel (c) shows the result at each level, obtained by intersecting the two histograms in (b); the number of matches is printed below each plot, e.g. x0 = 2, x1 = 4, x2 = 3 (the original figure says 5; is that an error?).
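The figure's effect can be reproduced with a toy 1-D example (the point sets below are made up, not the figure's data): widening the bins can only increase the intersection count, since points that matched in a narrow bin still share the wider bin that contains it.

```python
# Same two point sets binned at increasingly coarse resolutions;
# the histogram intersection (number of matches) grows monotonically.
from collections import Counter

def matches(a, b, width):
    ha = Counter(int(x / width) for x in a)
    hb = Counter(int(x / width) for x in b)
    return sum(min(c, hb[k]) for k, c in ha.items())

y = [0.5, 1.5, 3.2, 4.8, 6.1]
z = [0.6, 2.9, 3.3, 5.9, 6.4]
for width in (1, 2, 4):  # finer -> coarser bins
    print(width, matches(y, z, width))  # -> 1 3, 2 4, 4 5
```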
Q:
Comparing with the example and the code: eq. (1) here apparently serves, via the SPM-style kernels (2)/(3), to produce H and H' (the histogram statistics over the features, though the meaning of I there differs from the original paper). A cell is a cluster center; a match means landing in the same cluster center; H is the histogram over features of the same cluster; (1) then computes I, i.e. the number of matches, via the histogram intersection function.
As the paper reads, the order should be: first compute (1), then (2)/(3). For deeper understanding see "The Pyramid Match Kernel: Discriminative Classification with Sets of Image Features". My own reading: the weight constants and the channels in (2)/(3) can be multiplied in up front (or accounted for from the start), so the example and the code appear to apply (2)/(3) first and let (1) run last to determine the number of matches, i.e. the similarity (intersection kernel) of the two histograms. The original paper expresses this with a formula (the equation image is not preserved here).
Local vs. global feature representations: the paper describes SPM as achieving "approximate global geometric correspondence", yet also as "an alternative formulation of a locally orderless image". How should these two views be reconciled, and how are traditional local and global features defined? Which methods fall into each category?
PS:
Partly sourced from: http://blog.csdn.net/v_JULY_v