Creating LMDB Files from Raw Images

2018-02-06 15:37

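Preface: I have a lot of MNIST handwritten-digit images on hand, downloaded from CSDN. While learning handwritten-digit classification with Caffe, I found that running a few scripts does all the work for you, so I wanted to work out how the LMDB files are actually made, one step at a time.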

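I downloaded my image set from http://download.csdn.net/download/bless2015/9610008. It contains only plain image files, so I wanted to convert it into LMDB format. I went through the following steps: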



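1. Rename all the images in testimage. Because this is the test set, I did not want it to stay sorted by class, so I rename the files and put them all in one folder. For example, the image 20.bmp in the folder for digit 1 becomes 1_20.bmp. I do this for every folder and finally put all the images in a single folder, namely testimage.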

My renaming code:

#coding=utf-8
import os

# Root folder that holds one subfolder per digit (0-9).
root = '/home/michael/MNIST_Image/testimage/pic2'

for digit in range(10):
    path = os.path.join(root, str(digit))
    # Number the files in each folder from 1 and prefix them with the digit
    # label, e.g. 0_1.bmp, 0_2.bmp, ... (the counter replaces the old name).
    for count, name in enumerate(os.listdir(path), start=1):
        new_name = "%d_%d.bmp" % (digit, count)
        os.rename(os.path.join(path, name), os.path.join(path, new_name))




Then I copied all the images from each folder into the top-level testimage directory.
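
The rename-and-copy step could also be done in one pass. The following is only a sketch, not the code I actually ran: the function name flatten_class_folders and its two arguments are made up for illustration, and unlike the script above it keeps the original file name after the label prefix (20.bmp in folder 1 becomes 1_20.bmp) instead of using a counter.

```python
import os
import shutil


def flatten_class_folders(src_root, dst_dir):
    """Copy src_root/<label>/<name> to dst_dir/<label>_<name>.

    Returns the list of new file names in the order they were copied.
    """
    if not os.path.isdir(dst_dir):
        os.makedirs(dst_dir)
    copied = []
    for label in sorted(os.listdir(src_root)):
        class_dir = os.path.join(src_root, label)
        if not os.path.isdir(class_dir):
            continue  # skip stray files next to the class folders
        for name in sorted(os.listdir(class_dir)):
            new_name = label + "_" + name
            shutil.copy(os.path.join(class_dir, name),
                        os.path.join(dst_dir, new_name))
            copied.append(new_name)
    return copied
```

With my layout the call would look something like flatten_class_folders('/home/michael/MNIST_Image/testimage/pic2', '/home/michael/MNIST_Image/testimage'). Copying rather than moving leaves the per-class folders intact in case something goes wrong.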

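As for trainimage, this is our training data, and the folder an image sits in is its label, so we leave trainimage untouched. The next step is to convert the images to LMDB format. The conversion needs a few things decided up front: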

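1. The paths of the trainimage and testimage folders to convert.

2. The path where the generated LMDB files will be stored.

3. train.txt and test.txt, which list each target file's path and its label (when we renamed the test files earlier, we made the first character the label).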

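All of this has to be written into a script file; once that is done, running the script generates everything automatically. So first, here is how to generate train.txt and test.txt.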

First, look at part of train.txt:

0/20.bmp 0
0/3334.bmp 0
0/15.bmp 0
0/123.bmp 0
0/4465.bmp 0
0/1181.bmp 0


Now part of test.txt:

5_122.bmp 5
4_79.bmp 4
5_738.bmp 5
9_527.bmp 9
0_871.bmp 0
3_70.bmp 3
3_586.bmp 3
6_370.bmp 6


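Because we pass the testimage and trainimage paths to the conversion tool, each line of a txt file only needs the file name (path) relative to that directory plus its label. Our testimage has no further level of subdirectories, so the bare file name is enough. I generate the two txt files with Python code, pieced together from a few simple functions.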

This is my code for generating test.txt:

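# -*- coding: utf-8 -*-
import os

# I later moved the files here; change the path to suit your setup.
names = os.listdir("/home/michael/ML_caffe/DataSet/MNIST/data/testimage")

# Each line is "<file name> <label>"; the label is the first character of the name.
with open("test.txt", "w") as file_handle:
    for name in names:
        file_handle.write(name + ' ' + name[0:1] + '\n')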


This is my code for generating train.txt:

# -*- coding: utf-8 -*-
import os

root = "/home/michael/ML_caffe/DataSet/MNIST/data/trainimage"

# Each line is "<digit folder>/<file name> <label>"; the folder name is the label.
with open("train.txt", "w") as file_handle:
    for digit in range(10):
        label = str(digit)
        for name in os.listdir(os.path.join(root, label)):
            file_handle.write(label + '/' + name + ' ' + label + '\n')
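
Before converting, it may be worth checking that every line of a list file really points at an existing image and ends in a numeric label. The helper below is a sketch under my own assumptions: check_list_file is a hypothetical name, and it treats everything before the last space as the relative path.

```python
import os


def check_list_file(list_path, data_root):
    """Return a list of problems found in a '<relative path> <label>' list file."""
    problems = []
    with open(list_path) as handle:
        for line_no, line in enumerate(handle, start=1):
            line = line.rstrip("\n")
            if not line:
                continue  # ignore blank lines
            # Split on the last space so the path itself may contain spaces.
            rel_path, sep, label = line.rpartition(" ")
            if not sep or not label.isdigit():
                problems.append("line %d: bad label in %r" % (line_no, line))
            elif not os.path.isfile(os.path.join(data_root, rel_path)):
                problems.append("line %d: missing file %s" % (line_no, rel_path))
    return problems
```

An empty result means every entry could be found under the given root; anything else is worth fixing before running the conversion.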


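You should now find two txt files, each with file names and labels. Next comes generating the LMDB files. I use the following script to do it; you will need to change the paths yourself (this assumes Caffe is already set up).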

#!/usr/bin/env sh
# Output folder; the LMDB subdirectories are created under it
EXAMPLE=/home/michael/ML_caffe/DataSet/MNIST
# Where our txt files are stored
DATA=/home/michael/ML_caffe/DataSet/MNIST
# Location of the Caffe tools
TOOLS=/home/michael/caffe/build/tools

# Training set, split into one folder per class
TRAIN_DATA_ROOT=/home/michael/ML_caffe/DataSet/MNIST/data/trainimage/
# Test set, all images mixed in one folder
TEST_DATA_ROOT=/home/michael/ML_caffe/DataSet/MNIST/data/testimage/

# Whether to resize the images
RESIZE=true
if $RESIZE; then
  # Can be changed to any target size
  RESIZE_HEIGHT=32
  RESIZE_WIDTH=32
else
  RESIZE_HEIGHT=0
  RESIZE_WIDTH=0
fi

echo "Creating train lmdb..."

# Call convert_imageset with the resize options, --shuffle (random order),
# the training-set root, the list file with paths and labels (generated with
# Python), and the location where the LMDB is written.
# The comments sit above the command because nothing may follow a
# line-continuation backslash.
GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $TRAIN_DATA_ROOT \
    $DATA/train.txt \
    $EXAMPLE/train_lmdb

echo "Creating test lmdb..."

GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $TEST_DATA_ROOT \
    $DATA/test.txt \
    $EXAMPLE/test_lmdb

echo "Done."
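
If you would rather drive the tool from Python, the same command line can be built as an argument list. This is a sketch, not something I ran for this post: build_convert_command and run_convert are hypothetical helper names, and only the flags already used in the script (--resize_height, --resize_width, --shuffle) appear in it.

```python
import os
import subprocess


def build_convert_command(tools_dir, data_root, list_file, out_db,
                          resize_height=0, resize_width=0, shuffle=True):
    """Return the convert_imageset argument list used in the script above."""
    # Keep a trailing slash on the image root, matching the paths in the script.
    if not data_root.endswith("/"):
        data_root += "/"
    cmd = [
        tools_dir.rstrip("/") + "/convert_imageset",
        "--resize_height=%d" % resize_height,
        "--resize_width=%d" % resize_width,
    ]
    if shuffle:
        cmd.append("--shuffle")
    # Positional arguments: image root folder, list file, output database.
    cmd += [data_root, list_file, out_db]
    return cmd


def run_convert(cmd):
    """Run the command with GLOG_logtostderr=1 set, as the script does."""
    env = dict(os.environ, GLOG_logtostderr="1")
    return subprocess.call(cmd, env=env)
```

Passing the arguments as a list avoids shell quoting altogether, so a stray comment or space in a path cannot end up inside the command.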


Once the paths and the files prepared above are in place, run the following in a terminal in the current folder:

bash mnist-lmdb.sh


This generates the LMDB files we need.

michael@ASUS:~/ML_caffe/DataSet/MNIST$ bash mnist-lmdb.sh
Creating train lmdb...
I0206 14:28:48.380004  5571 convert_imageset.cpp:86] Shuffling data
I0206 14:28:48.765233  5571 convert_imageset.cpp:89] A total of 60000 images.
I0206 14:28:48.765497  5571 db_lmdb.cpp:35] Opened lmdb /home/michael/ML_caffe/DataSet/MNIST/train_lmdb
I0206 14:28:48.934583  5571 convert_imageset.cpp:147] Processed 1000 files.
I0206 14:28:49.104089  5571 convert_imageset.cpp:147] Processed 2000 files.
I0206 14:28:49.274312  5571 convert_imageset.cpp:147] Processed 3000 files.
I0206 14:28:49.449018  5571 convert_imageset.cpp:147] Processed 4000 files.
I0206 14:28:49.611840  5571 convert_imageset.cpp:147] Processed 5000 files.
I0206 14:28:49.775197  5571 convert_imageset.cpp:147] Processed 6000 files.
I0206 14:28:49.937268  5571 convert_imageset.cpp:147] Processed 7000 files.
I0206 14:28:50.104672  5571 convert_imageset.cpp:147] Processed 8000 files.
I0206 14:28:50.264158  5571 convert_imageset.cpp:147] Processed 9000 files.
I0206 14:28:50.449117  5571 convert_imageset.cpp:147] Processed 10000 files.
I0206 14:28:50.616402  5571 convert_imageset.cpp:147] Processed 11000 files.
I0206 14:28:50.781594  5571 convert_imageset.cpp:147] Processed 12000 files.
I0206 14:28:50.954336  5571 convert_imageset.cpp:147] Processed 13000 files.
I0206 14:28:51.116982  5571 convert_imageset.cpp:147] Processed 14000 files.
I0206 14:28:51.281421  5571 convert_imageset.cpp:147] Processed 15000 files.
I0206 14:28:51.450546  5571 convert_imageset.cpp:147] Processed 16000 files.
I0206 14:28:51.645671  5571 convert_imageset.cpp:147] Processed 17000 files.
I0206 14:28:51.811952  5571 convert_imageset.cpp:147] Processed 18000 files.
I0206 14:28:52.000439  5571 convert_imageset.cpp:147] Processed 19000 files.
I0206 14:28:52.170938  5571 convert_imageset.cpp:147] Processed 20000 files.
I0206 14:28:52.344597  5571 convert_imageset.cpp:147] Processed 21000 files.
I0206 14:28:52.509713  5571 convert_imageset.cpp:147] Processed 22000 files.
I0206 14:28:52.673687  5571 convert_imageset.cpp:147] Processed 23000 files.
I0206 14:28:52.837333  5571 convert_imageset.cpp:147] Processed 24000 files.
I0206 14:28:53.022176  5571 convert_imageset.cpp:147] Processed 25000 files.
I0206 14:28:53.182229  5571 convert_imageset.cpp:147] Processed 26000 files.
I0206 14:28:53.351642  5571 convert_imageset.cpp:147] Processed 27000 files.
I0206 14:28:53.515982  5571 convert_imageset.cpp:147] Processed 28000 files.
I0206 14:28:53.675796  5571 convert_imageset.cpp:147] Processed 29000 files.
I0206 14:28:53.847538  5571 convert_imageset.cpp:147] Processed 30000 files.
I0206 14:28:54.023780  5571 convert_imageset.cpp:147] Processed 31000 files.
I0206 14:28:54.193473  5571 convert_imageset.cpp:147] Processed 32000 files.
I0206 14:28:54.366469  5571 convert_imageset.cpp:147] Processed 33000 files.
I0206 14:28:54.539541  5571 convert_imageset.cpp:147] Processed 34000 files.
I0206 14:28:54.702636  5571 convert_imageset.cpp:147] Processed 35000 files.
I0206 14:28:54.863446  5571 convert_imageset.cpp:147] Processed 36000 files.
I0206 14:28:55.031626  5571 convert_imageset.cpp:147] Processed 37000 files.
I0206 14:28:55.252513  5571 convert_imageset.cpp:147] Processed 38000 files.
I0206 14:28:55.425963  5571 convert_imageset.cpp:147] Processed 39000 files.
I0206 14:28:55.614948  5571 convert_imageset.cpp:147] Processed 40000 files.
I0206 14:28:55.778008  5571 convert_imageset.cpp:147] Processed 41000 files.
I0206 14:28:55.947497  5571 convert_imageset.cpp:147] Processed 42000 files.
I0206 14:28:56.112609  5571 convert_imageset.cpp:147] Processed 43000 files.
I0206 14:28:56.282855  5571 convert_imageset.cpp:147] Processed 44000 files.
I0206 14:28:56.445716  5571 convert_imageset.cpp:147] Processed 45000 files.
I0206 14:28:56.606652  5571 convert_imageset.cpp:147] Processed 46000 files.
I0206 14:28:56.776607  5571 convert_imageset.cpp:147] Processed 47000 files.
I0206 14:28:56.946336  5571 convert_imageset.cpp:147] Processed 48000 files.
I0206 14:28:57.121255  5571 convert_imageset.cpp:147] Processed 49000 files.
I0206 14:28:57.310395  5571 convert_imageset.cpp:147] Processed 50000 files.
I0206 14:28:57.497797  5571 convert_imageset.cpp:147] Processed 51000 files.
I0206 14:28:57.676450  5571 convert_imageset.cpp:147] Processed 52000 files.
I0206 14:28:57.852885  5571 convert_imageset.cpp:147] Processed 53000 files.
I0206 14:28:58.041313  5571 convert_imageset.cpp:147] Processed 54000 files.
I0206 14:28:58.212401  5571 convert_imageset.cpp:147] Processed 55000 files.
I0206 14:28:58.378213  5571 convert_imageset.cpp:147] Processed 56000 files.
I0206 14:28:58.541512  5571 convert_imageset.cpp:147] Processed 57000 files.
I0206 14:28:58.702957  5571 convert_imageset.cpp:147] Processed 58000 files.
I0206 14:28:58.871085  5571 convert_imageset.cpp:147] Processed 59000 files.
I0206 14:28:59.038836  5571 convert_imageset.cpp:147] Processed 60000 files.
Creating val lmdb...
I0206 14:28:59.116549  5579 convert_imageset.cpp:86] Shuffling data
I0206 14:28:59.494279  5579 convert_imageset.cpp:89] A total of 10000 images.
I0206 14:28:59.494462  5579 db_lmdb.cpp:35] Opened lmdb /home/michael/ML_caffe/DataSet/MNIST/test_lmdb
I0206 14:28:59.567311  5579 convert_imageset.cpp:147] Processed 1000 files.
I0206 14:28:59.638276  5579 convert_imageset.cpp:147] Processed 2000 files.
I0206 14:28:59.728598  5579 convert_imageset.cpp:147] Processed 3000 files.
I0206 14:28:59.802536  5579 convert_imageset.cpp:147] Processed 4000 files.
I0206 14:28:59.876827  5579 convert_imageset.cpp:147] Processed 5000 files.
I0206 14:28:59.963062  5579 convert_imageset.cpp:147] Processed 6000 files.
I0206 14:29:00.042140  5579 convert_imageset.cpp:147] Processed 7000 files.
I0206 14:29:00.135749  5579 convert_imageset.cpp:147] Processed 8000 files.
I0206 14:29:00.210954  5579 convert_imageset.cpp:147] Processed 9000 files.
I0206 14:29:00.285415  5579 convert_imageset.cpp:147] Processed 10000 files.
Done.
michael@ASUS:~/ML_caffe/DataSet/MNIST$
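
A quick sanity check on this output is to compare the "A total of N images." lines with the number of entries in train.txt and test.txt. The two helpers below are only a sketch with made-up names (totals_from_log, count_entries); they assume the log has been saved to a text file or captured as a string.

```python
import re


def totals_from_log(log_text):
    """Return the image counts from 'A total of N images.' log lines, in order."""
    return [int(n) for n in re.findall(r"A total of (\d+) images\.", log_text)]


def count_entries(list_path):
    """Count the non-empty lines of a list file such as train.txt."""
    with open(list_path) as handle:
        return sum(1 for line in handle if line.strip())
```

For the run above, the counts in the log are 60000 and 10000, so count_entries("train.txt") and count_entries("test.txt") should give the same two numbers if every listed image was picked up.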

