
A Hierarchical Deep Temporal Model for Group Activity Recognition

2017-02-10 14:15

1: Main Content

The paper's main contribution is a hierarchical framework, which works in three steps:

(1) A deep CNN model is used to extract complex features for each person, in addition to the temporal features captured by the first LSTM layer.

(2) The concatenation of the CNN features and the LSTM layer represents temporal features for a person. Various pooling strategies can be used to aggregate these features over all people in the scene at each time step.

(3) The output of the pooling layer forms our representation for the group activity. The second LSTM network, working on top of the temporal representation, is used to directly model the temporal dynamics of group activity. The LSTM layer of the second network is directly connected to a classification layer in order to detect group activity classes in a video sequence.
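
Step (2) can be illustrated with a small NumPy sketch. This is only an illustration under assumed array shapes, not the authors' Caffe implementation: the function names `person_representation` and `pool_over_people`, the `(T, N, D)` layout (time steps, people, feature size), and the toy numbers are all made up here.

```python
import numpy as np

def person_representation(cnn_feats, lstm_states):
    """Concatenate per-person CNN features with first-stage LSTM outputs.

    cnn_feats:   (T, N, D_cnn)  features for N people over T time steps
    lstm_states: (T, N, D_lstm) first-LSTM outputs for the same people
    returns:     (T, N, D_cnn + D_lstm)
    """
    return np.concatenate([cnn_feats, lstm_states], axis=-1)

def pool_over_people(person_feats, mode="max"):
    """Aggregate person features into one vector per time step.

    person_feats: (T, N, D) -> (T, D). The pooling runs over the people
    axis, so the result does not depend on how many people are present.
    """
    if mode == "max":
        return person_feats.max(axis=1)
    if mode == "avg":
        return person_feats.mean(axis=1)
    raise ValueError(f"unknown pooling mode: {mode!r}")

# Toy input (hypothetical values): T=2 time steps, N=3 people,
# 2-d "CNN" features and a 1-d "LSTM" output per person.
cnn = np.array([[[1.0, 0.0], [0.0, 2.0], [3.0, 1.0]],
                [[0.5, 0.5], [4.0, 0.0], [1.0, 1.0]]])
lstm = np.array([[[0.1], [0.9], [0.4]],
                 [[0.2], [0.3], [0.8]]])

frame_feats = pool_over_people(person_representation(cnn, lstm), mode="max")
print(frame_feats.shape)  # (2, 3): one 3-d frame-level vector per time step
```

Pooling over the people axis is what lets the second LSTM see a fixed-size input at each time step, whatever the number of detected people.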

2: System Framework





3: Main Formulas
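
The three steps listed earlier can be written compactly. What follows is a sketch in standard notation, not a transcription of the paper's equations: the LSTM recurrences are the common textbook form without peephole connections, and the symbols $x_{t,k}$ (CNN features of person $k$ at time $t$), $h_{t,k}$ (first-LSTM output), $P_{t,k}$ and $Z_t$ are names assumed here for illustration.

```latex
% Standard LSTM cell (assumed form, no peepholes)
\begin{aligned}
i_t &= \sigma(W_{xi} x_t + W_{hi} h_{t-1} + b_i) \\
f_t &= \sigma(W_{xf} x_t + W_{hf} h_{t-1} + b_f) \\
o_t &= \sigma(W_{xo} x_t + W_{ho} h_{t-1} + b_o) \\
g_t &= \tanh(W_{xc} x_t + W_{hc} h_{t-1} + b_c) \\
c_t &= f_t \odot c_{t-1} + i_t \odot g_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}

% Person-level representation: concatenation of CNN and LSTM features
P_{t,k} = x_{t,k} \oplus h_{t,k}

% Frame-level representation: pooling over the K people in the scene
% (element-wise max shown as one possible pooling strategy)
Z_t = \max_{k = 1, \dots, K} P_{t,k}
```

The sequence $Z_1, \dots, Z_T$ is then the input to the second LSTM network.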



4: Implementation Details

We trained our model in two steps. In the first step, the person-level CNN and the first LSTM layer are trained in an end-to-end fashion using a set of training data consisting of person tracklets annotated with action labels. We implement our model using Caffe [14]. Similar to other approaches [9, 7, 38], we initialize our CNN model with the pre-trained AlexNet network and we fine-tune the whole network for the first LSTM layer. 9 timesteps and 3000 hidden nodes are used for the first LSTM layer and a softmax layer is deployed for the classification layer in this stage. After training the first LSTM layer, we concatenate the fc7 layer of AlexNet and the LSTM layer for every person and pool over all people in a scene. The pooled features, which correspond to frame-level features, are fed to the second LSTM network. This network consists of a 3000-node fully connected layer followed by a 9-timestep 500-node LSTM layer which is passed to a softmax layer trained to recognize group activity labels.
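
To make the second stage concrete, here is a minimal NumPy sketch of a fully connected layer followed by an LSTM and a softmax, applied to pooled frame-level features. It is a forward pass with random weights, not the authors' Caffe model. Several details are assumptions made for illustration: the ReLU after the fully connected layer, a softmax at every time step, the weight initialization, the helper names, and `n_classes=8`. The demo uses scaled-down sizes; with the numbers quoted above the input would be 4096 (AlexNet fc7) + 3000 = 7096 dimensions, the fully connected layer 3000, and the LSTM 500.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))  # shift for numerical stability
    return e / e.sum(axis=-1, keepdims=True)

def init_params(d_in, d_fc, d_lstm, n_classes, seed=0):
    """Random weights for FC -> LSTM -> softmax (illustrative initialization)."""
    rng = np.random.default_rng(seed)
    w = lambda *shape: rng.normal(0.0, 0.1, size=shape)
    return {
        "W_fc": w(d_in, d_fc), "b_fc": np.zeros(d_fc),
        "W_x": w(d_fc, 4 * d_lstm), "W_h": w(d_lstm, 4 * d_lstm),
        "b": np.zeros(4 * d_lstm),
        "W_out": w(d_lstm, n_classes), "b_out": np.zeros(n_classes),
    }

def group_stage(frame_feats, p):
    """frame_feats: (T, d_in) pooled features -> (T, n_classes) probabilities."""
    H = p["W_h"].shape[0]
    h, c = np.zeros(H), np.zeros(H)
    probs = []
    for x_t in frame_feats:
        # Fully connected layer; the ReLU is an assumption of this sketch.
        fc = np.maximum(0.0, x_t @ p["W_fc"] + p["b_fc"])
        # One LSTM step; gates are packed as [input, forget, output, candidate].
        z = fc @ p["W_x"] + h @ p["W_h"] + p["b"]
        i, f, o = sigmoid(z[:H]), sigmoid(z[H:2 * H]), sigmoid(z[2 * H:3 * H])
        g = np.tanh(z[3 * H:])
        c = f * c + i * g
        h = o * np.tanh(c)
        probs.append(softmax(h @ p["W_out"] + p["b_out"]))
    return np.stack(probs)

# Scaled-down demo: 9 time steps, as in the text; all other sizes are toy values.
T, d_in, d_fc, d_lstm, n_classes = 9, 71, 30, 5, 8
params = init_params(d_in, d_fc, d_lstm, n_classes)
frames = np.random.default_rng(1).normal(size=(T, d_in))
probs = group_stage(frames, params)
print(probs.shape)  # (9, 8)
```

Each row of `probs` is a distribution over group activity classes for one time step; how the per-step outputs are combined into a single sequence label is left open here.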

5: Summary

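This was the first paper I read on group activity recognition. The first read did not leave much of an impression, but after a few more passes the network struck me as very logical: a CNN extracts features for each person, the first LSTM links those CNN features along the time axis, max pooling aggregates the features across people, and a fully connected layer feeds the result to the top LSTM, followed by a softmax classifier. The volleyball dataset is quite useful and worth keeping an eye on. However, I am not yet familiar with person detection and person tracking, so I have not run the code. For reference, see GitHub: github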
Tags: group activity, LSTM