
Deep Learning Experiment Log

2017-10-03 11:46
# **IMPORTANT**
# Please note that this learning rate schedule is heavily dependent on the
# hardware architecture, batch size and any changes to the model architecture
# specification. Selecting a finely tuned learning rate schedule is an
# empirical process that requires some experimentation. Please see README.md
# for more guidance and discussion.
#
# With 8 Tesla K40's and a batch size = 256, the following setup achieves
# precision @ 1 = 73.5% after 100 hours and 100K steps (20 epochs).
# Learning rate decay factor selected from http://arxiv.org/abs/1404.5997.
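
The schedule in question is an exponentially decayed learning rate. Below is a minimal TensorFlow 1.x sketch of that idea; the initial rate of 0.1, decay factor of 0.16 and 30 epochs per decay are the Inception training script's documented defaults, taken here as assumptions, and (as the note above warns) they would need retuning for a single-GPU, batch-size-32 setup.

import tensorflow as tf

num_examples = 1281167                    # ILSVRC-2012 training set size
batch_size = 32                           # what fits on the GTX 1060 here
batches_per_epoch = num_examples // batch_size

global_step = tf.train.get_or_create_global_step()
learning_rate = tf.train.exponential_decay(
    learning_rate=0.1,                    # assumed initial rate
    global_step=global_step,
    decay_steps=batches_per_epoch * 30,   # assumed: decay every 30 epochs
    decay_rate=0.16,                      # factor cited from arXiv:1404.5997
    staircase=True)                       # step-wise rather than smooth decay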
Open TensorBoard: tensorboard --logdir=/tmp/imagenet_train

The ImageNet training data is roughly 1,000k images (the full ILSVRC-2012 set is 1,281,167). Training the Inception v3 network on a GTX 1060 with batch_size = 32 runs at about 32 examples/sec.

After 20 hours and 70k steps, the model had seen 32 × 70k ≈ 2,240k images, about 2 epochs of training data; the loss dropped from 13 to 8, and the downward trend flattened out.

After 55 hours and 204k steps, 32 × 204k ≈ 6,528k images, about 6 epochs; the loss dropped from 13 to 7 and has been nearly flat since step 120k.

After 4 days and 1 hour (97 h) and 360k steps, 32 × 360k ≈ 11,520k images, about 10 epochs; the loss is still around 7 and has stayed nearly flat since step 120k.
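
As a quick consistency check on those numbers, here is a plain-Python sketch; 1,281,167 is the actual ILSVRC-2012 train-set size, and everything else is read straight off the log above.

batch_size = 32
num_train = 1281167  # actual ILSVRC-2012 training images

# (hours, steps) pairs from the log entries above.
for hours, steps in [(20, 70000), (55, 204000), (97, 360000)]:
    seen = batch_size * steps              # images processed so far
    epochs = seen / float(num_train)       # epochs vs. the full set
    rate = seen / (hours * 3600.0)         # implied throughput
    print("%3d h, %6d steps: %5.2fM images, %.1f epochs, ~%.0f examples/sec"
          % (hours, steps, seen / 1e6, epochs, rate))

The implied throughput comes out at 31-33 examples/sec, matching the observed 32 examples/sec; measured against the full 1.28M-image set, the epoch counts are closer to 1.7, 5.1 and 9.0.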

Eval: precision @ 1 = 0.5584, recall @ 5 = 0.8052 [50016 examples]
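
For reference, precision @ 1 and recall @ 5 here are top-1 and top-5 accuracy over the validation examples. A minimal TensorFlow 1.x sketch of that computation (logits and labels are illustrative placeholders, not names from the eval script):

import tensorflow as tf

logits = tf.placeholder(tf.float32, [None, 1000])  # one score per class
labels = tf.placeholder(tf.int32, [None])          # ground-truth class ids

top_1 = tf.nn.in_top_k(logits, labels, k=1)   # is the true class the argmax?
top_5 = tf.nn.in_top_k(logits, labels, k=5)   # is it among the top 5?
precision_at_1 = tf.reduce_mean(tf.cast(top_1, tf.float32))
recall_at_5 = tf.reduce_mean(tf.cast(top_5, tf.float32))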

A similar case: https://stackoverflow.com/questions/38259166/training-tensorflow-inception-v3-imagenet-on-modest-hardware-setup (that author did not reach the published optimum either):

2016-06-06 12:07:52.245005: precision @ 1 = 0.5767 recall @ 5 = 0.8143 [50016 examples]
2016-06-09 22:35:10.118852: precision @ 1 = 0.5957 recall @ 5 = 0.8294 [50016 examples]
2016-06-14 15:30:59.532629: precision @ 1 = 0.6112 recall @ 5 = 0.8396 [50016 examples]
2016-06-20 13:57:14.025797: precision @ 1 = 0.6136 recall @ 5 = 0.8423 [50016 examples]

On a small hardware setup like yours, it will be difficult to achieve maximum performance. Generally speaking for CNN's, the best performance is with the largest batch sizes possible. This means that for CNN's the training procedure is often limited by the maximum batch size that can fit in GPU memory.
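
One common workaround when memory caps the batch size (not something this post tries) is gradient accumulation: sum the gradients over several small batches and apply a single update, emulating a larger effective batch. A rough TensorFlow 1.x sketch on a toy model, assuming dense gradients throughout:

import tensorflow as tf

ACCUM_STEPS = 8  # e.g. 8 micro-batches of 32 -> effective batch of 256

# Toy model standing in for the real network.
x = tf.placeholder(tf.float32, [None, 10])
y = tf.placeholder(tf.int32, [None])
logits = tf.layers.dense(x, 5)
loss = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=logits))

opt = tf.train.RMSPropOptimizer(learning_rate=0.1)  # illustrative rate
grads_and_vars = opt.compute_gradients(loss)

# One non-trainable buffer per variable holds the running gradient sum.
accum = [tf.Variable(tf.zeros(v.shape, dtype=v.dtype), trainable=False)
         for _, v in grads_and_vars]
zero_op = tf.group(*[a.assign(tf.zeros_like(a)) for a in accum])
accum_op = tf.group(*[a.assign_add(g)
                      for a, (g, _) in zip(accum, grads_and_vars)])
apply_op = opt.apply_gradients(
    [(a / ACCUM_STEPS, v) for a, (_, v) in zip(accum, grads_and_vars)])

# Per update: run zero_op once, accum_op on ACCUM_STEPS different micro-batches,
# then apply_op to take one optimizer step with the averaged gradient.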

