TensorFlow 23: "Prank" -- Face Detection
2017-05-17 00:25
In a previous post, 《OpenCV检测场景内是否有移动物体》, I used a Raspberry Pi to build a simple Motion Detection rig for the bathroom: it automatically plays music while I'm on the toilet.
I rent a place on my own, and a few friends often come over on weekends. They think my Motion Detection gadget is lame, so I decided to prank them: use TensorFlow to build a "face recognizer" that plays music when it's me on the toilet, and plays "Zhang Zhen's Ghost Stories" (《张震讲鬼故事》) for anyone else (@xingCI says fart sounds would be funnier).
My task is to train a model that can tell my face apart from everyone else's. Note the quotes around "face recognition" above: this is not real face recognition, just image classification. If you need real face recognition, try an off-the-shelf stack such as OpenFace + dlib (see 《使用OpenFace进行人脸识别》).
TensorFlow has already been ported to the Raspberry Pi; see the tensorflow-on-raspberry-pi project.
This post needs two sets of data: images containing my face, and images containing other people's faces.
Collecting other people's faces
Grab a pile of images -- anything that does not contain you -- and use OpenCV to extract the faces from them. I use OpenCV for the extraction; dlib is said to give better results.
other_peoples_faces.py:
import cv2
import os
import sys

IMAGE_DIR = 'path/to/image/dir'  # directory with the source images
OUTPUT_DIR = './other_people'

if not os.path.exists(OUTPUT_DIR):
    os.makedirs(OUTPUT_DIR)

# http://blog.topspeedsnail.com/archives/10511
# wget https://raw.githubusercontent.com/opencv/opencv/master/data/haarcascades/haarcascade_frontalface_default.xml
face_haar = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")

for (dirpath, dirnames, filenames) in os.walk(IMAGE_DIR):
    for filename in filenames:
        if filename.endswith('.jpg'):
            image_path = os.path.join(dirpath, filename)
            print('process: ', image_path)
            img = cv2.imread(image_path)
            gray_image = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
            faces = face_haar.detectMultiScale(gray_image, 1.3, 5)
            for face_x, face_y, face_w, face_h in faces:
                face = img[face_y:face_y+face_h, face_x:face_x+face_w]
                face = cv2.resize(face, (64, 64))
                cv2.imshow("img", face)
                cv2.imwrite(os.path.join(OUTPUT_DIR, filename), face)
                key = cv2.waitKey(30) & 0xff
                if key == 27:  # Esc to quit
                    sys.exit(0)
Out of 40,000-odd images I extracted about 10,000 faces, which should be plenty.
Image size: 64×64.
The detection above is done with OpenCV; with this dataset you could in turn train a TensorFlow face detector.
斗大熊's face
Take 10,000 photos of yourself -- the most I've ever shot in one go.
import cv2
import os
import sys

OUTPUT_DIR = './my_faces'

if not os.path.exists(OUTPUT_DIR):
    os.makedirs(OUTPUT_DIR)

face_haar = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")
cam = cv2.VideoCapture(0)

count = 0
while True:
    print(count)
    if count < 10000:
        _, img = cam.read()
        gray_image = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        faces = face_haar.detectMultiScale(gray_image, 1.3, 5)
        for face_x, face_y, face_w, face_h in faces:
            face = img[face_y:face_y+face_h, face_x:face_x+face_w]
            face = cv2.resize(face, (64, 64))
            cv2.imshow('img', face)
            cv2.imwrite(os.path.join(OUTPUT_DIR, str(count) + '.jpg'), face)
            count += 1
        key = cv2.waitKey(30) & 0xff
        if key == 27:  # Esc to quit early
            break
    else:
        break
In front of the camera, shake your head, strike poses, put on glasses or headphones, tilt your head back 45°, write code, grimace, play with your phone... Vary it as much as you can, until you have 10,000 face shots.
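If collecting 10,000 genuinely varied shots is tedious, simple augmentation can stretch a smaller set. A numpy-only sketch -- the `augment` helper is hypothetical, not part of the original scripts:

```python
import numpy as np

def augment(face):
    """Return simple variants of a 64x64 BGR face crop: the original,
    a horizontal mirror, and a darker and a brighter copy."""
    variants = [face, face[:, ::-1]]  # mirror left-right
    for delta in (-30, 30):           # darker / brighter copy
        shifted = np.clip(face.astype(np.int16) + delta, 0, 255).astype(np.uint8)
        variants.append(shifted)
    return variants
```

Each saved crop then yields four training images; heavier augmentation (rotations, crops) is possible, but mirroring and brightness shifts already cover the most common webcam variation.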
Training the model
With the training data in place, it's time to train.
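Before resizing, the training script pads each crop with black borders so it becomes square, which avoids distorting the face's aspect ratio. The same idea in plain numpy, as a sketch (`pad_to_square` is a hypothetical name, no OpenCV required):

```python
import numpy as np

def pad_to_square(image):
    """Center an H x W x C image on a black square canvas whose side
    equals the longer edge, mirroring get_padding_size + copyMakeBorder."""
    h, w, c = image.shape
    edge = max(h, w)
    top = (edge - h) // 2
    left = (edge - w) // 2
    out = np.zeros((edge, edge, c), dtype=image.dtype)
    out[top:top + h, left:left + w] = image
    return out
```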
import tensorflow as tf
import cv2
import numpy as np
import os
from sklearn.model_selection import train_test_split
import random
import sys

my_image_path = 'my_faces'
others_image_path = 'other_people'

image_data = []
label_data = []

def get_padding_size(image):
    h, w, _ = image.shape
    longest_edge = max(h, w)
    top, bottom, left, right = (0, 0, 0, 0)
    if h < longest_edge:
        dh = longest_edge - h
        top = dh // 2
        bottom = dh - top
    elif w < longest_edge:
        dw = longest_edge - w
        left = dw // 2
        right = dw - left
    return top, bottom, left, right

def read_data(img_path, image_h=64, image_w=64):
    for filename in os.listdir(img_path):
        if filename.endswith('.jpg'):
            filepath = os.path.join(img_path, filename)
            image = cv2.imread(filepath)
            top, bottom, left, right = get_padding_size(image)
            image_pad = cv2.copyMakeBorder(image, top, bottom, left, right, cv2.BORDER_CONSTANT, value=[0, 0, 0])
            image = cv2.resize(image_pad, (image_h, image_w))
            image_data.append(image)
            label_data.append(img_path)

read_data(others_image_path)
read_data(my_image_path)

image_data = np.array(image_data)
label_data = np.array([[0, 1] if label == 'my_faces' else [1, 0] for label in label_data])

train_x, test_x, train_y, test_y = train_test_split(image_data, label_data, test_size=0.05, random_state=random.randint(0, 100))

# image (height=64, width=64, channel=3)
train_x = train_x.reshape(train_x.shape[0], 64, 64, 3)
test_x = test_x.reshape(test_x.shape[0], 64, 64, 3)

# normalize
train_x = train_x.astype('float32') / 255.0
test_x = test_x.astype('float32') / 255.0

print(len(train_x), len(train_y))
print(len(test_x), len(test_y))

#############################################################
batch_size = 128
num_batch = len(train_x) // batch_size

X = tf.placeholder(tf.float32, [None, 64, 64, 3])  # 64x64 images, 3 channels
Y = tf.placeholder(tf.float32, [None, 2])

keep_prob_5 = tf.placeholder(tf.float32)
keep_prob_75 = tf.placeholder(tf.float32)

def panda_joke_cnn():
    W_c1 = tf.Variable(tf.random_normal([3, 3, 3, 32], stddev=0.01))
    b_c1 = tf.Variable(tf.random_normal([32]))
    conv1 = tf.nn.relu(tf.nn.bias_add(tf.nn.conv2d(X, W_c1, strides=[1, 1, 1, 1], padding='SAME'), b_c1))
    conv1 = tf.nn.max_pool(conv1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
    conv1 = tf.nn.dropout(conv1, keep_prob_5)

    W_c2 = tf.Variable(tf.random_normal([3, 3, 32, 64], stddev=0.01))
    b_c2 = tf.Variable(tf.random_normal([64]))
    conv2 = tf.nn.relu(tf.nn.bias_add(tf.nn.conv2d(conv1, W_c2, strides=[1, 1, 1, 1], padding='SAME'), b_c2))
    conv2 = tf.nn.max_pool(conv2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
    conv2 = tf.nn.dropout(conv2, keep_prob_5)

    W_c3 = tf.Variable(tf.random_normal([3, 3, 64, 64], stddev=0.01))
    b_c3 = tf.Variable(tf.random_normal([64]))
    conv3 = tf.nn.relu(tf.nn.bias_add(tf.nn.conv2d(conv2, W_c3, strides=[1, 1, 1, 1], padding='SAME'), b_c3))
    conv3 = tf.nn.max_pool(conv3, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
    conv3 = tf.nn.dropout(conv3, keep_prob_5)

    # Fully connected layer: three 2x2 poolings shrink 64x64 to 8x8, with 64 channels
    W_d = tf.Variable(tf.random_normal([8*8*64, 512], stddev=0.01))
    b_d = tf.Variable(tf.random_normal([512]))
    dense = tf.reshape(conv3, [-1, W_d.get_shape().as_list()[0]])
    dense = tf.nn.relu(tf.add(tf.matmul(dense, W_d), b_d))
    dense = tf.nn.dropout(dense, keep_prob_75)

    W_out = tf.Variable(tf.random_normal([512, 2], stddev=0.01))
    b_out = tf.Variable(tf.random_normal([2]))
    out = tf.add(tf.matmul(dense, W_out), b_out)
    return out

def train_cnn():
    output = panda_joke_cnn()

    loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=output, labels=Y))
    optimizer = tf.train.AdamOptimizer(learning_rate=0.001).minimize(loss)
    accuracy = tf.reduce_mean(tf.cast(tf.equal(tf.argmax(output, 1), tf.argmax(Y, 1)), tf.float32))

    tf.summary.scalar("loss", loss)
    tf.summary.scalar("accuracy", accuracy)
    merged_summary_op = tf.summary.merge_all()

    saver = tf.train.Saver()
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        summary_writer = tf.summary.FileWriter('./log', graph=tf.get_default_graph())

        for e in range(50):
            for i in range(num_batch):
                batch_x = train_x[i*batch_size : (i+1)*batch_size]
                batch_y = train_y[i*batch_size : (i+1)*batch_size]
                _, loss_, summary = sess.run([optimizer, loss, merged_summary_op], feed_dict={X: batch_x, Y: batch_y, keep_prob_5: 0.5, keep_prob_75: 0.75})
                summary_writer.add_summary(summary, e*num_batch+i)
                print(e*num_batch+i, loss_)

                if (e*num_batch+i) % 100 == 0:
                    acc = accuracy.eval({X: test_x, Y: test_y, keep_prob_5: 1.0, keep_prob_75: 1.0})
                    print(e*num_batch+i, acc)
                    # save the model once it is good enough
                    if acc > 0.98:
                        saver.save(sess, "i_am_a_joke.model", global_step=e*num_batch+i)
                        sys.exit(0)

train_cnn()
Accuracy curve:
Next, use the model on the Raspberry Pi. Sample code:
# Reuses X, keep_prob_5, keep_prob_75 and panda_joke_cnn() from the training script above.
output = panda_joke_cnn()
predict = tf.argmax(output, 1)

saver = tf.train.Saver()
sess = tf.Session()
saver.restore(sess, tf.train.latest_checkpoint('.'))

def is_my_face(image):
    res = sess.run(predict, feed_dict={X: [image/255.0], keep_prob_5: 1.0, keep_prob_75: 1.0})
    return res[0] == 1  # class 1 ([0,1]) is "my face"

face_haar = cv2.CascadeClassifier("haarcascade_frontalface_default.xml")
cam = cv2.VideoCapture(0)

while True:
    _, img = cam.read()
    gray_image = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = face_haar.detectMultiScale(gray_image, 1.3, 5)
    for face_x, face_y, face_w, face_h in faces:
        face = img[face_y:face_y+face_h, face_x:face_x+face_w]
        face = cv2.resize(face, (64, 64))
        print(is_my_face(face))
        cv2.imshow('img', face)
        key = cv2.waitKey(30) & 0xff
        if key == 27:  # Esc: clean up and quit
            sess.close()
            sys.exit(0)
Summary: the model uses 100-odd MB of memory and the accuracy is passable -- good enough for now.
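To close the loop on the prank, the Pi loop just has to map the `is_my_face()` verdict to an audio file. A minimal sketch -- the file names and the `omxplayer` call are assumptions; substitute whatever player and tracks you actually use:

```python
import subprocess

MUSIC = "music.mp3"        # hypothetical file name: played for me
GHOST_STORY = "ghost.mp3"  # hypothetical file name: played for everyone else

def pick_audio(its_me):
    """Map the classifier's verdict to an audio file."""
    return MUSIC if its_me else GHOST_STORY

def play(path):
    """Launch an audio player in the background; omxplayer is a common
    choice on the Raspberry Pi, but any CLI player would do."""
    return subprocess.Popen(["omxplayer", path])
```

In the detection loop, `play(pick_audio(is_my_face(face)))` would then fire the right track; in practice you would also debounce so one visit doesn't launch dozens of players.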