Creating LMDB Files from Raw Images
2018-02-06 15:37
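Preface: I have a large set of MNIST handwritten-digit images that I downloaded from CSDN. While learning handwritten-digit classification with caffe, I noticed that running a few of the bundled scripts does all the data preparation for you, so I wanted to work out how the LMDB files are actually built, one step at a time. My image set comes from http://download.csdn.net/download/bless2015/9610008 and consists of plain image files, which I wanted to convert to LMDB format. I went through the following steps:
1. Rename every image under testimage. Because this is the test set, I did not want it to stay sorted into per-class folders; I wanted all the images in a single folder after renaming. For example, an image in the folder for digit 1 gets a name such as 1_20.bmp (the label, an underscore, then a number). After doing this for each digit folder, all the images go into one folder, namely testimage.
My renaming code: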
#coding=utf-8
import os

# Root directory holding one subfolder per digit (0-9)
root = '/home/michael/MNIST_Image/testimage/pic2'

for digit in range(10):
    path = os.path.join(root, str(digit))
    count = 1
    # Rename each image to "<digit>_<count>.bmp", numbering from 1 in each folder
    for name in os.listdir(path):
        os.rename(os.path.join(path, name),
                  os.path.join(path, str(digit) + "_" + str(count) + ".bmp"))
        count += 1
I then copied all of the renamed images into the top-level testimage directory.
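The rename and copy steps can also be done in one pass. A minimal sketch, assuming the same layout of one subfolder per digit; the function name `flatten_labeled_dirs` and its arguments are hypothetical, and unlike the script above it keeps each original file name after the label prefix and copies the files instead of renaming them in place:

```python
import os
import shutil


def flatten_labeled_dirs(src_root, dst_dir):
    """Copy src_root/<label>/<name> to dst_dir/<label>_<name>; return the count."""
    if not os.path.isdir(dst_dir):
        os.makedirs(dst_dir)
    copied = 0
    for label in sorted(os.listdir(src_root)):
        label_dir = os.path.join(src_root, label)
        if not os.path.isdir(label_dir):
            continue  # skip stray files next to the label folders
        for name in sorted(os.listdir(label_dir)):
            shutil.copy(os.path.join(label_dir, name),
                        os.path.join(dst_dir, label + "_" + name))
            copied += 1
    return copied
```

The destination should be a folder outside `src_root`; otherwise it would itself be picked up as a label folder.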
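The trainimage folder holds our training data, and the name of each subfolder is the label of the images inside it, so trainimage can be left as it is. The next step is to convert the images to LMDB format. The conversion needs a few things to be decided:
1. The paths of the trainimage and testimage directories to convert.
2. The path where the generated LMDB files will be stored.
3. train.txt and test.txt, which list each image's path together with its label (the renaming above already made the first character of each file name its label).
All of this is written into a script file; once it is set up, running the script generates the LMDB files automatically. So first, here is how to generate train.txt and test.txt.
First, part of train.txt: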
0/20.bmp 0
0/3334.bmp 0
0/15.bmp 0
0/123.bmp 0
0/4465.bmp 0
0/1181.bmp 0
Then part of test.txt:
5_122.bmp 5
4_79.bmp 4
5_738.bmp 5
9_527.bmp 9
0_871.bmp 0
3_70.bmp 3
3_586.bmp 3
6_370.bmp 6
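Because the conversion script specifies the testimage and trainimage root paths, each txt file only needs the file name (path) relative to that root, plus the label. testimage has no subdirectories, so the bare file name is enough. Both txt files are generated with Python code, put together from a few simple functions.
This is my code for generating test.txt: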
# -*- coding: utf-8 -*-
import os

# I later moved the files here; change the path to suit your setup.
names = os.listdir("/home/michael/ML_caffe/DataSet/MNIST/data/testimage")

fileHandle = open("test.txt", "w")
for name in names:
    # The first character of the file name is the label
    fileHandle.write(name + ' ' + name[0:1] + '\n')
fileHandle.close()
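Taking `name[0:1]` as the label works here because every label is a single digit. If the labels could be longer than one character, splitting on the underscore is one alternative. A minimal sketch; `label_from_name` and `list_line` are hypothetical helpers, not part of the original script:

```python
def label_from_name(name):
    """Return the text before the first underscore, e.g. '5' for '5_122.bmp'."""
    label, sep, _rest = name.partition("_")
    if not sep or not label.isdigit():
        raise ValueError("expected '<label>_<number>.bmp', got %r" % name)
    return label


def list_line(name):
    """Format one line of the list file: '<file name> <label>'."""
    return name + " " + label_from_name(name)
```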
This is my code for generating train.txt:
# -*- coding: utf-8 -*-
import os

root = "/home/michael/ML_caffe/DataSet/MNIST/data/trainimage"

fileHandle = open("train.txt", "w")
for digit in range(10):
    label = str(digit)
    # Each subfolder name is the label of the images it contains
    for name in os.listdir(os.path.join(root, label)):
        fileHandle.write(label + '/' + name + ' ' + label + '\n')
fileHandle.close()
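Before converting, it can be worth checking that every line of a list file points at a real image and carries an integer label. A rough sketch, assuming the `<relative path> <label>` line format shown earlier; `check_list_file` is a hypothetical helper:

```python
import os


def check_list_file(list_path, data_root):
    """Return a list of (line number, problem) pairs for one list file."""
    problems = []
    with open(list_path) as handle:
        for lineno, line in enumerate(handle, 1):
            line = line.rstrip("\n")
            if not line:
                continue
            # The label is the last space-separated field.
            rel_path, sep, label = line.rpartition(" ")
            if not sep or not label.isdigit():
                problems.append((lineno, "missing or non-integer label"))
            elif not os.path.isfile(os.path.join(data_root, rel_path)):
                problems.append((lineno, "file not found: " + rel_path))
    return problems
```

An empty result means every listed file exists under `data_root` and has a numeric label.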
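This produces the two txt files, each listing files together with their labels. The next step is to generate the LMDB files. I used the following script; you will need to change the paths to match your own setup (I assume caffe is already built and configured).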
#!/usr/bin/env sh

# Output folder; the LMDB subdirectories are created under it
EXAMPLE=/home/michael/ML_caffe/DataSet/MNIST
# Where our txt files are stored
DATA=/home/michael/ML_caffe/DataSet/MNIST
# Location of the caffe tools
TOOLS=/home/michael/caffe/build/tools

# Training set, split into one folder per label
TRAIN_DATA_ROOT=/home/michael/ML_caffe/DataSet/MNIST/data/trainimage/
# Test set, all images mixed in one folder
TEST_DATA_ROOT=/home/michael/ML_caffe/DataSet/MNIST/data/testimage/

# Whether to resize the images
RESIZE=true
if $RESIZE; then
  # Change these to the size you need
  RESIZE_HEIGHT=32
  RESIZE_WIDTH=32
else
  RESIZE_HEIGHT=0
  RESIZE_WIDTH=0
fi

echo "Creating train lmdb..."

# Call the conversion tool. --shuffle randomizes the image order.
# Positional arguments: image root, list file (relative paths and labels,
# generated with Python), and the directory where the LMDB is written.
# Comments must not follow the trailing backslashes, or the command breaks.
GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $TRAIN_DATA_ROOT \
    $DATA/train.txt \
    $EXAMPLE/train_lmdb

echo "Creating test lmdb..."

GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $TEST_DATA_ROOT \
    $DATA/test.txt \
    $EXAMPLE/test_lmdb

echo "Done."
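The same two calls can be driven from Python instead of a shell script. This is a sketch only, assuming `convert_imageset` accepts the flags used in the script above (`--resize_height`, `--resize_width`, `--shuffle`); the helper names `build_convert_cmd` and `run_convert` are hypothetical:

```python
import os
import subprocess


def build_convert_cmd(tools_dir, data_root, list_file, out_lmdb,
                      resize=32, shuffle=True):
    """Build the convert_imageset argument list used in the shell script."""
    cmd = [os.path.join(tools_dir, "convert_imageset"),
           "--resize_height=%d" % resize,
           "--resize_width=%d" % resize]
    if shuffle:
        cmd.append("--shuffle")
    # Positional arguments: image root, list file, output LMDB directory.
    cmd += [data_root, list_file, out_lmdb]
    return cmd


def run_convert(cmd):
    """Run the tool with glog output sent to stderr, as the script does."""
    env = dict(os.environ, GLOG_logtostderr="1")
    return subprocess.call(cmd, env=env)
```

Keeping the command as a list avoids shell quoting problems when a path contains spaces, and `build_convert_cmd` can be inspected before anything is run.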
Once the paths and the files described above are ready, run the following in a terminal from the folder that contains the script:
bash mnist-lmdb.sh
This generates the LMDB files we need. The output from my run:
michael@ASUS:~/ML_caffe/DataSet/MNIST$ bash mnist-lmdb.sh
Creating train lmdb...
I0206 14:28:48.380004 5571 convert_imageset.cpp:86] Shuffling data
I0206 14:28:48.765233 5571 convert_imageset.cpp:89] A total of 60000 images.
I0206 14:28:48.765497 5571 db_lmdb.cpp:35] Opened lmdb /home/michael/ML_caffe/DataSet/MNIST/train_lmdb
I0206 14:28:48.934583 5571 convert_imageset.cpp:147] Processed 1000 files.
I0206 14:28:49.104089 5571 convert_imageset.cpp:147] Processed 2000 files.
I0206 14:28:49.274312 5571 convert_imageset.cpp:147] Processed 3000 files.
I0206 14:28:49.449018 5571 convert_imageset.cpp:147] Processed 4000 files.
I0206 14:28:49.611840 5571 convert_imageset.cpp:147] Processed 5000 files.
I0206 14:28:49.775197 5571 convert_imageset.cpp:147] Processed 6000 files.
I0206 14:28:49.937268 5571 convert_imageset.cpp:147] Processed 7000 files.
I0206 14:28:50.104672 5571 convert_imageset.cpp:147] Processed 8000 files.
I0206 14:28:50.264158 5571 convert_imageset.cpp:147] Processed 9000 files.
I0206 14:28:50.449117 5571 convert_imageset.cpp:147] Processed 10000 files.
I0206 14:28:50.616402 5571 convert_imageset.cpp:147] Processed 11000 files.
I0206 14:28:50.781594 5571 convert_imageset.cpp:147] Processed 12000 files.
I0206 14:28:50.954336 5571 convert_imageset.cpp:147] Processed 13000 files.
I0206 14:28:51.116982 5571 convert_imageset.cpp:147] Processed 14000 files.
I0206 14:28:51.281421 5571 convert_imageset.cpp:147] Processed 15000 files.
I0206 14:28:51.450546 5571 convert_imageset.cpp:147] Processed 16000 files.
I0206 14:28:51.645671 5571 convert_imageset.cpp:147] Processed 17000 files.
I0206 14:28:51.811952 5571 convert_imageset.cpp:147] Processed 18000 files.
I0206 14:28:52.000439 5571 convert_imageset.cpp:147] Processed 19000 files.
I0206 14:28:52.170938 5571 convert_imageset.cpp:147] Processed 20000 files.
I0206 14:28:52.344597 5571 convert_imageset.cpp:147] Processed 21000 files.
I0206 14:28:52.509713 5571 convert_imageset.cpp:147] Processed 22000 files.
I0206 14:28:52.673687 5571 convert_imageset.cpp:147] Processed 23000 files.
I0206 14:28:52.837333 5571 convert_imageset.cpp:147] Processed 24000 files.
I0206 14:28:53.022176 5571 convert_imageset.cpp:147] Processed 25000 files.
I0206 14:28:53.182229 5571 convert_imageset.cpp:147] Processed 26000 files.
I0206 14:28:53.351642 5571 convert_imageset.cpp:147] Processed 27000 files.
I0206 14:28:53.515982 5571 convert_imageset.cpp:147] Processed 28000 files.
I0206 14:28:53.675796 5571 convert_imageset.cpp:147] Processed 29000 files.
I0206 14:28:53.847538 5571 convert_imageset.cpp:147] Processed 30000 files.
I0206 14:28:54.023780 5571 convert_imageset.cpp:147] Processed 31000 files.
I0206 14:28:54.193473 5571 convert_imageset.cpp:147] Processed 32000 files.
I0206 14:28:54.366469 5571 convert_imageset.cpp:147] Processed 33000 files.
I0206 14:28:54.539541 5571 convert_imageset.cpp:147] Processed 34000 files.
I0206 14:28:54.702636 5571 convert_imageset.cpp:147] Processed 35000 files.
I0206 14:28:54.863446 5571 convert_imageset.cpp:147] Processed 36000 files.
I0206 14:28:55.031626 5571 convert_imageset.cpp:147] Processed 37000 files.
I0206 14:28:55.252513 5571 convert_imageset.cpp:147] Processed 38000 files.
I0206 14:28:55.425963 5571 convert_imageset.cpp:147] Processed 39000 files.
I0206 14:28:55.614948 5571 convert_imageset.cpp:147] Processed 40000 files.
I0206 14:28:55.778008 5571 convert_imageset.cpp:147] Processed 41000 files.
I0206 14:28:55.947497 5571 convert_imageset.cpp:147] Processed 42000 files.
I0206 14:28:56.112609 5571 convert_imageset.cpp:147] Processed 43000 files.
I0206 14:28:56.282855 5571 convert_imageset.cpp:147] Processed 44000 files.
I0206 14:28:56.445716 5571 convert_imageset.cpp:147] Processed 45000 files.
I0206 14:28:56.606652 5571 convert_imageset.cpp:147] Processed 46000 files.
I0206 14:28:56.776607 5571 convert_imageset.cpp:147] Processed 47000 files.
I0206 14:28:56.946336 5571 convert_imageset.cpp:147] Processed 48000 files.
I0206 14:28:57.121255 5571 convert_imageset.cpp:147] Processed 49000 files.
I0206 14:28:57.310395 5571 convert_imageset.cpp:147] Processed 50000 files.
I0206 14:28:57.497797 5571 convert_imageset.cpp:147] Processed 51000 files.
I0206 14:28:57.676450 5571 convert_imageset.cpp:147] Processed 52000 files.
I0206 14:28:57.852885 5571 convert_imageset.cpp:147] Processed 53000 files.
I0206 14:28:58.041313 5571 convert_imageset.cpp:147] Processed 54000 files.
I0206 14:28:58.212401 5571 convert_imageset.cpp:147] Processed 55000 files.
I0206 14:28:58.378213 5571 convert_imageset.cpp:147] Processed 56000 files.
I0206 14:28:58.541512 5571 convert_imageset.cpp:147] Processed 57000 files.
I0206 14:28:58.702957 5571 convert_imageset.cpp:147] Processed 58000 files.
I0206 14:28:58.871085 5571 convert_imageset.cpp:147] Processed 59000 files.
I0206 14:28:59.038836 5571 convert_imageset.cpp:147] Processed 60000 files.
Creating val lmdb...
I0206 14:28:59.116549 5579 convert_imageset.cpp:86] Shuffling data
I0206 14:28:59.494279 5579 convert_imageset.cpp:89] A total of 10000 images.
I0206 14:28:59.494462 5579 db_lmdb.cpp:35] Opened lmdb /home/michael/ML_caffe/DataSet/MNIST/test_lmdb
I0206 14:28:59.567311 5579 convert_imageset.cpp:147] Processed 1000 files.
I0206 14:28:59.638276 5579 convert_imageset.cpp:147] Processed 2000 files.
I0206 14:28:59.728598 5579 convert_imageset.cpp:147] Processed 3000 files.
I0206 14:28:59.802536 5579 convert_imageset.cpp:147] Processed 4000 files.
I0206 14:28:59.876827 5579 convert_imageset.cpp:147] Processed 5000 files.
I0206 14:28:59.963062 5579 convert_imageset.cpp:147] Processed 6000 files.
I0206 14:29:00.042140 5579 convert_imageset.cpp:147] Processed 7000 files.
I0206 14:29:00.135749 5579 convert_imageset.cpp:147] Processed 8000 files.
I0206 14:29:00.210954 5579 convert_imageset.cpp:147] Processed 9000 files.
I0206 14:29:00.285415 5579 convert_imageset.cpp:147] Processed 10000 files.
Done.
michael@ASUS:~/ML_caffe/DataSet/MNIST$
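A quick way to confirm that nothing was skipped is to compare each "A total of N images" entry with the last "Processed N files" entry that follows it. A small sketch that parses log text in the format shown above; `summarize_log` is a hypothetical helper, and the regular expression assumes exactly this log wording:

```python
import re


def summarize_log(text):
    """Return one (total, last processed) pair per convert_imageset run."""
    runs = []
    pattern = r"A total of (\d+) images|Processed (\d+) files"
    for match in re.finditer(pattern, text):
        total, done = match.groups()
        if total is not None:
            runs.append([int(total), 0])  # a new run starts here
        elif runs:
            runs[-1][1] = int(done)  # keep the latest progress count
    return [tuple(run) for run in runs]
```

For the run above, both pairs should match: 60000 of 60000 for the training set and 10000 of 10000 for the test set.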