Caffe Source Code Study: The Proto Data Format [1]
2017-06-16 12:35
Preface:
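Because of work, I have been using Caffe for close to six months, mostly busy reading papers and reproducing models large and small. Along the way I wrote up some notes on the Caffe source code, but only in fits and starts. This time I plan to record my study of the source systematically, brush up my C++, and sort out my earlier understanding.
Main text:
Let's set the architecture of the framework aside for now and start with caffe.proto. It is written in Google Protobuf, a data interchange format open-sourced by Google. Here is part of what caffe.proto contains: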
    syntax = "proto2";

    // The structures in caffe.proto are wrapped in the caffe package; they can be
    // used via `using namespace caffe;` or caffe::**
    package caffe;

    // Specifies the shape (dimensions) of a Blob.
    message BlobShape {
      repeated int64 dim = 1 [packed = true];
    }

    message BlobProto {
      optional BlobShape shape = 7;
      repeated float data = 5 [packed = true];
      repeated float diff = 6 [packed = true];
      repeated double double_data = 8 [packed = true];
      repeated double double_diff = 9 [packed = true];

      // 4D dimensions -- deprecated. Use "shape" instead.
      optional int32 num = 1 [default = 0];
      optional int32 channels = 2 [default = 0];
      optional int32 height = 3 [default = 0];
      optional int32 width = 4 [default = 0];
    }

    .......
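Once Caffe has been built, two files, caffe.pb.h and caffe.pb.cc, are generated automatically in src/caffe/proto.
So how is this data format used in a program? Let's walk through a simple example that shows how caffe.proto is turned into caffe.pb.h and caffe.pb.cc, and how the generated code is then called.
If you have built Caffe successfully, Protobuf is already installed, so let me write a small example that uses it.
First we write a file named caffe.proto: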
    package caffe;

    message student {
      required int32 age = 1;     // ID; "required" marks a mandatory field
      required string name = 2;   // str; mandatory field
      optional int32 grade = 3;   // optional field: may be present or absent (at most once)
    }
Then we run the following command:
    protoc -I=. --cpp_out=. ./caffe.proto
protoc automatically generates the .cc and .h files:
    // Generated by the protocol buffer compiler.  DO NOT EDIT!
    // source: caffe.proto

    #define INTERNAL_SUPPRESS_PROTOBUF_FIELD_DEPRECATION
    #include "caffe.pb.h"

    #include <algorithm>

    #include <google/protobuf/stubs/common.h>
    #include <google/protobuf/stubs/once.h>
    #include <google/protobuf/io/coded_stream.h>
    #include <google/protobuf/wire_format_lite_inl.h>
    #include <google/protobuf/descriptor.h>
    #include <google/protobuf/generated_message_reflection.h>
    #include <google/protobuf/reflection_ops.h>
    #include <google/protobuf/wire_format.h>
    // @@protoc_insertion_point(includes)

    namespace caffe {

    namespace {

    const ::google::protobuf::Descriptor* student_descriptor_ = NULL;
    const ::google::protobuf::internal::GeneratedMessageReflection*
      student_reflection_ = NULL;

    }  // namespace

    void protobuf_AssignDesc_caffe_2eproto() {
      protobuf_AddDesc_caffe_2eproto();
      const ::google::protobuf::FileDescriptor* file =
        ::google::protobuf::DescriptorPool::generated_pool()->FindFileByName(
          "caffe.proto");
      GOOGLE_CHECK(file != NULL);
      student_descriptor_ = file->message_type(0);
      static const int student_offsets_[3] = {
        GOOGLE_PROTOBUF_GENERATED_MESSAGE_FIELD_OFFSET(student, age_),
        GOOGLE_PROTOBUF_GENERATED_MESSAGE_FIELD_OFFSET(student, name_),
        GOOGLE_PROTOBUF_GENERATED_MESSAGE_FIELD_OFFSET(student, grade_),
      };
      student_reflection_ =
        new ::google::protobuf::internal::GeneratedMessageReflection(
          student_descriptor_,
          student::default_instance_,
          student_offsets_,
          GOOGLE_PROTOBUF_GENERATED_MESSAGE_FIELD_OFFSET(student, _has_bits_[0]),
          GOOGLE_PROTOBUF_GENERATED_MESSAGE_FIELD_OFFSET(student, _unknown_fields_),
          -1,
          ::google::protobuf::DescriptorPool::generated_pool(),
          ::google::protobuf::MessageFactory::generated_factory(),
          sizeof(student));
    }

    namespace {

    GOOGLE_PROTOBUF_DECLARE_ONCE(protobuf_AssignDescriptors_once_);
    inline void protobuf_AssignDescriptorsOnce() {
      ::google::protobuf::GoogleOnceInit(&protobuf_AssignDescriptors_once_,
                                         &protobuf_AssignDesc_caffe_2eproto);
    }

    void protobuf_RegisterTypes(const ::std::string&) {
      protobuf_AssignDescriptorsOnce();
      ::google::protobuf::MessageFactory::InternalRegisterGeneratedMessage(
        student_descriptor_, &student::default_instance());
    }

    }  // namespace

    void protobuf_ShutdownFile_caffe_2eproto() {
      delete student::default_instance_;
      delete student_reflection_;
    }

    void protobuf_AddDesc_caffe_2eproto() {
      static bool already_here = false;
      if (already_here) return;
      already_here = true;
      GOOGLE_PROTOBUF_VERIFY_VERSION;

      ::google::protobuf::DescriptorPool::InternalAddGeneratedFile(
        "\n\013caffe.proto\022\005caffe\"3\n\007student\022\013\n\003age\030\001"
        " \002(\005\022\014\n\004name\030\002 \002(\t\022\r\n\005grade\030\003 \001(\005", 73);
      ::google::protobuf::MessageFactory::InternalRegisterGeneratedFile(
        "caffe.proto", &protobuf_RegisterTypes);
      student::default_instance_ = new student();
      student::default_instance_->InitAsDefaultInstance();
      ::google::protobuf::internal::OnShutdown(&protobuf_ShutdownFile_caffe_2eproto);
    }

    // Force AddDescriptors() to be called at static initialization time.
    struct StaticDescriptorInitializer_caffe_2eproto {
      StaticDescriptorInitializer_caffe_2eproto() {
        protobuf_AddDesc_caffe_2eproto();
      }
    } static_descriptor_initializer_caffe_2eproto_;

    // ===================================================================

    #ifndef _MSC_VER
    const int student::kAgeFieldNumber;
    const int student::kNameFieldNumber;
    const int student::kGradeFieldNumber;
    #endif  // !_MSC_VER

    student::student()
      : ::google::protobuf::Message() {
      SharedCtor();
    }

    void student::InitAsDefaultInstance() {
    }

    student::student(const student& from)
      : ::google::protobuf::Message() {
      SharedCtor();
      MergeFrom(from);
    }

    void student::SharedCtor() {
      _cached_size_ = 0;
      age_ = 0;
      name_ = const_cast< ::std::string*>(&::google::protobuf::internal::kEmptyString);
      grade_ = 0;
      ::memset(_has_bits_, 0, sizeof(_has_bits_));
    }

    student::~student() {
      SharedDtor();
    }

    void student::SharedDtor() {
      if (name_ != &::google::protobuf::internal::kEmptyString) {
        delete name_;
      }
      if (this != default_instance_) {
      }
    }

    void student::SetCachedSize(int size) const {
      GOOGLE_SAFE_CONCURRENT_WRITES_BEGIN();
      _cached_size_ = size;
      GOOGLE_SAFE_CONCURRENT_WRITES_END();
    }

    const ::google::protobuf::Descriptor* student::descriptor() {
      protobuf_AssignDescriptorsOnce();
      return student_descriptor_;
    }

    const student& student::default_instance() {
      if (default_instance_ == NULL) protobuf_AddDesc_caffe_2eproto();
      return *default_instance_;
    }

    student* student::default_instance_ = NULL;

    student* student::New() const {
      return new student;
    }

    void student::Clear() {
      if (_has_bits_[0 / 32] & (0xffu << (0 % 32))) {
        age_ = 0;
        if (has_name()) {
          if (name_ != &::google::protobuf::internal::kEmptyString) {
            name_->clear();
          }
        }
        grade_ = 0;
      }
      ::memset(_has_bits_, 0, sizeof(_has_bits_));
      mutable_unknown_fields()->Clear();
    }

    bool student::MergePartialFromCodedStream(
        ::google::protobuf::io::CodedInputStream* input) {
    #define DO_(EXPRESSION) if (!(EXPRESSION)) return false
      ::google::protobuf::uint32 tag;
      while ((tag = input->ReadTag()) != 0) {
        switch (::google::protobuf::internal::WireFormatLite::GetTagFieldNumber(tag)) {
          // required int32 age = 1;
          case 1: {
            if (::google::protobuf::internal::WireFormatLite::GetTagWireType(tag) ==
                ::google::protobuf::internal::WireFormatLite::WIRETYPE_VARINT) {
              DO_((::google::protobuf::internal::WireFormatLite::ReadPrimitive<
                       ::google::protobuf::int32,
                       ::google::protobuf::internal::WireFormatLite::TYPE_INT32>(
                     input, &age_)));
              set_has_age();
            } else {
              goto handle_uninterpreted;
            }
            if (input->ExpectTag(18)) goto parse_name;
            break;
          }

          // required string name = 2;
          case 2: {
            if (::google::protobuf::internal::WireFormatLite::GetTagWireType(tag) ==
                ::google::protobuf::internal::WireFormatLite::WIRETYPE_LENGTH_DELIMITED) {
             parse_name:
              DO_(::google::protobuf::internal::WireFormatLite::ReadString(
                    input, this->mutable_name()));
              ::google::protobuf::internal::WireFormat::VerifyUTF8String(
                this->name().data(), this->name().length(),
                ::google::protobuf::internal::WireFormat::PARSE);
            } else {
              goto handle_uninterpreted;
            }
            if (input->ExpectTag(24)) goto parse_grade;
            break;
          }

          // optional int32 grade = 3;
          case 3: {
            if (::google::protobuf::internal::WireFormatLite::GetTagWireType(tag) ==
                ::google::protobuf::internal::WireFormatLite::WIRETYPE_VARINT) {
             parse_grade:
              DO_((::google::protobuf::internal::WireFormatLite::ReadPrimitive<
                       ::google::protobuf::int32,
                       ::google::protobuf::internal::WireFormatLite::TYPE_INT32>(
                     input, &grade_)));
              set_has_grade();
            } else {
              goto handle_uninterpreted;
            }
            if (input->ExpectAtEnd()) return true;
            break;
          }

          default: {
          handle_uninterpreted:
            if (::google::protobuf::internal::WireFormatLite::GetTagWireType(tag) ==
                ::google::protobuf::internal::WireFormatLite::WIRETYPE_END_GROUP) {
              return true;
            }
            DO_(::google::protobuf::internal::WireFormat::SkipField(
                  input, tag, mutable_unknown_fields()));
            break;
          }
        }
      }
      return true;
    #undef DO_
    }

    void student::SerializeWithCachedSizes(
        ::google::protobuf::io::CodedOutputStream* output) const {
      // required int32 age = 1;
      if (has_age()) {
        ::google::protobuf::internal::WireFormatLite::WriteInt32(1, this->age(), output);
      }

      // required string name = 2;
      if (has_name()) {
        ::google::protobuf::internal::WireFormat::VerifyUTF8String(
          this->name().data(), this->name().length(),
          ::google::protobuf::internal::WireFormat::SERIALIZE);
        ::google::protobuf::internal::WireFormatLite::WriteString(
          2, this->name(), output);
      }

      // optional int32 grade = 3;
      if (has_grade()) {
        ::google::protobuf::internal::WireFormatLite::WriteInt32(3, this->grade(), output);
      }

      if (!unknown_fields().empty()) {
        ::google::protobuf::internal::WireFormat::SerializeUnknownFields(
            unknown_fields(), output);
      }
    }

    ::google::protobuf::uint8* student::SerializeWithCachedSizesToArray(
        ::google::protobuf::uint8* target) const {
      // required int32 age = 1;
      if (has_age()) {
        target = ::google::protobuf::internal::WireFormatLite::WriteInt32ToArray(1, this->age(), target);
      }

      // required string name = 2;
      if (has_name()) {
        ::google::protobuf::internal::WireFormat::VerifyUTF8String(
          this->name().data(), this->name().length(),
          ::google::protobuf::internal::WireFormat::SERIALIZE);
        target =
          ::google::protobuf::internal::WireFormatLite::WriteStringToArray(
            2, this->name(), target);
      }

      // optional int32 grade = 3;
      if (has_grade()) {
        target = ::google::protobuf::internal::WireFormatLite::WriteInt32ToArray(3, this->grade(), target);
      }

      if (!unknown_fields().empty()) {
        target = ::google::protobuf::internal::WireFormat::SerializeUnknownFieldsToArray(
            unknown_fields(), target);
      }
      return target;
    }

    int student::ByteSize() const {
      int total_size = 0;

      if (_has_bits_[0 / 32] & (0xffu << (0 % 32))) {
        // required int32 age = 1;
        if (has_age()) {
          total_size += 1 +
            ::google::protobuf::internal::WireFormatLite::Int32Size(
              this->age());
        }

        // required string name = 2;
        if (has_name()) {
          total_size += 1 +
            ::google::protobuf::internal::WireFormatLite::StringSize(
              this->name());
        }

        // optional int32 grade = 3;
        if (has_grade()) {
          total_size += 1 +
            ::google::protobuf::internal::WireFormatLite::Int32Size(
              this->grade());
        }
      }
      if (!unknown_fields().empty()) {
        total_size +=
          ::google::protobuf::internal::WireFormat::ComputeUnknownFieldsSize(
            unknown_fields());
      }
      GOOGLE_SAFE_CONCURRENT_WRITES_BEGIN();
      _cached_size_ = total_size;
      GOOGLE_SAFE_CONCURRENT_WRITES_END();
      return total_size;
    }

    void student::MergeFrom(const ::google::protobuf::Message& from) {
      GOOGLE_CHECK_NE(&from, this);
      const student* source =
        ::google::protobuf::internal::dynamic_cast_if_available<const student*>(
          &from);
      if (source == NULL) {
        ::google::protobuf::internal::ReflectionOps::Merge(from, this);
      } else {
        MergeFrom(*source);
      }
    }

    void student::MergeFrom(const student& from) {
      GOOGLE_CHECK_NE(&from, this);
      if (from._has_bits_[0 / 32] & (0xffu << (0 % 32))) {
        if (from.has_age()) {
          set_age(from.age());
        }
        if (from.has_name()) {
          set_name(from.name());
        }
        if (from.has_grade()) {
          set_grade(from.grade());
        }
      }
      mutable_unknown_fields()->MergeFrom(from.unknown_fields());
    }

    void student::CopyFrom(const ::google::protobuf::Message& from) {
      if (&from == this) return;
      Clear();
      MergeFrom(from);
    }

    void student::CopyFrom(const student& from) {
      if (&from == this) return;
      Clear();
      MergeFrom(from);
    }

    bool student::IsInitialized() const {
      if ((_has_bits_[0] & 0x00000003) != 0x00000003) return false;

      return true;
    }

    void student::Swap(student* other) {
      if (other != this) {
        std::swap(age_, other->age_);
        std::swap(name_, other->name_);
        std::swap(grade_, other->grade_);
        std::swap(_has_bits_[0], other->_has_bits_[0]);
        _unknown_fields_.Swap(&other->_unknown_fields_);
        std::swap(_cached_size_, other->_cached_size_);
      }
    }

    ::google::protobuf::Metadata student::GetMetadata() const {
      protobuf_AssignDescriptorsOnce();
      ::google::protobuf::Metadata metadata;
      metadata.descriptor = student_descriptor_;
      metadata.reflection = student_reflection_;
      return metadata;
    }

    // @@protoc_insertion_point(namespace_scope)

    }  // namespace caffe

    // @@protoc_insertion_point(global_scope)
caffe.pb.cc
With the files generated, we write a program that uses them, caffeReader.cpp:
    #include "caffe.pb.h"
    #include <iostream>
    #include <ios>

    using namespace std;

    void InfoStudents(const caffe::student& stu) {
      cout << "student info:" << endl;
      cout << "name: " << stu.name() << endl;
      cout << "age: " << stu.age() << endl;
      cout << "grade: " << stu.grade() << endl;
    }

    int main(void) {
      caffe::student stu;
      stu.set_age(18);
      stu.set_name("gongxijun");
      stu.set_grade(146);
      InfoStudents(stu);
      return 0;
    }
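Next, we compile it with the following command:
    g++ caffeReader.cpp -o caffeReader caffe.pb.cc -I /usr/local/protobuf/include -L /usr/local/protobuf/lib -lprotobuf -pthread
This produces the executable caffeReader. Running it prints the following:
    student info:
    name: gongxijun
    age: 18
    grade: 146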
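Now that we know how a proto file is used, let's go back to caffe.proto and look at the structures Caffe defines there:
A message defines the structure of a set of parameters to be transmitted.
caffe.proto contains the structure definitions for Blob, Caffe's binary large object: BlobProto, BlobProtoVector and Datum.
BlobProto: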
    // Specifies the shape (dimensions) of a Blob.
    // Used to describe the shape of an image.
    message BlobShape {
      repeated int64 dim = 1 [packed = true];
    }

    message BlobProto {
      optional BlobShape shape = 7;  // replaces num, channels, height and width below; recommended
      repeated float data = 5 [packed = true];  // forward-pass data
      repeated float diff = 6 [packed = true];  // backward-pass residual data
      repeated double double_data = 8 [packed = true];  // double-precision forward-pass data
      repeated double double_diff = 9 [packed = true];  // double-precision backward-pass residual data

      // 4D dimensions -- deprecated. Use "shape" instead.
      optional int32 num = 1 [default = 0];       // number of images, or dimension
      optional int32 channels = 2 [default = 0];  // number of channels, e.g. 3 for RGB
      optional int32 height = 3 [default = 0];    // image height
      optional int32 width = 4 [default = 0];     // image width
    }
BlobProtoVector: a dynamic array of BlobProto
    message BlobProtoVector {
      repeated BlobProto blobs = 1;
    }
Datum is the data format stored in lmdb. For details, see http://blog.csdn.net/u010668907/article/details/51834411
    message Datum {  // the data format stored in lmdb
      optional int32 channels = 1;  // number of image channels, e.g. 3 for RGB
      optional int32 height = 2;    // image height
      optional int32 width = 3;     // image width
      // the actual image data, in bytes
      optional bytes data = 4;      // the image data, e.g. the three-dimensional RGB array
      optional int32 label = 5;     // label of this image or image patch [e.g. 20 classes mapped to the numbers 0~19]
      // Optionally, the datum could also hold float data.
      repeated float float_data = 6;  // image data; conversion sometimes turns the image into floats
      // If true data contains an encoded image that need to be decoded
      optional bool encoded = 7 [default = false];  // whether the data needs decoding
    }
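Most of what remains are the parameters for the Layer, Net and Solver parts. The relationship between the four parts, Blob, Layer, Net and Solver, is as follows:
The blob is the unit of data that runs through the whole framework. The Solver is configured through solver.prototxt and uses it to initialize the Net. [A note on .proto versus .prototxt: both are Google Protobuf files. A .proto file defines the message structures and their parameters, while a .prototxt file supplies the initial values for the structures defined in the corresponding .proto.] The Net then reads the parameters in trainval.prototxt to set up the corresponding Layers and feeds blob data into them. Each Layer processes the data flowing in and returns the computed blob, which the Net passes on to the next Layer. After every iteration the Solver increments its counter and adjusts hyperparameters such as the learning rate and weight decay. This summary follows a thread on the Caffe forum; the link is here: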
http://www.caffecn.cn/?/question/123&sort_key=agree_count&sort=DESC