Understanding TensorFlow's clip_by_norm function
2017-03-28 14:07
clip_by_norm
clip_by_norm clips gradients: by capping the maximum norm of the gradient, it guards against the exploding-gradient problem, and it is a fairly common form of gradient clipping. Here we take a closer look at clip_by_norm in TensorFlow.
Example
```python
import tensorflow as tf

# learning_rate and cost are assumed to be defined earlier
optimizer = tf.train.AdamOptimizer(learning_rate, beta1=0.5)
grads = optimizer.compute_gradients(cost)
for i, (g, v) in enumerate(grads):
    if g is not None:
        grads[i] = (tf.clip_by_norm(g, 5), v)  # clip gradients
train_op = optimizer.apply_gradients(grads)
```
The snippet above is a fairly standard way of defining the gradient-update computation, and it uses the tf.clip_by_norm method. Here is the source of that function:
```python
def clip_by_norm(t, clip_norm, axes=None, name=None):
  """Clips tensor values to a maximum L2-norm.

  Given a tensor `t`, and a maximum clip value `clip_norm`, this operation
  normalizes `t` so that its L2-norm is less than or equal to `clip_norm`,
  along the dimensions given in `axes`. Specifically, in the default case
  where all dimensions are used for calculation, if the L2-norm of `t` is
  already less than or equal to `clip_norm`, then `t` is not modified. If
  the L2-norm is greater than `clip_norm`, then this operation returns a
  tensor of the same type and shape as `t` with its values set to:

  `t * clip_norm / l2norm(t)`

  In this case, the L2-norm of the output tensor is `clip_norm`.

  As another example, if `t` is a matrix and `axes == [1]`, then each row
  of the output will have L2-norm equal to `clip_norm`. If `axes == [0]`
  instead, each column of the output will be clipped.

  This operation is typically used to clip gradients before applying them
  with an optimizer.

  Args:
    t: A `Tensor`.
    clip_norm: A 0-D (scalar) `Tensor` > 0. A maximum clipping value.
    axes: A 1-D (vector) `Tensor` of type int32 containing the dimensions
      to use for computing the L2-norm. If `None` (the default), uses all
      dimensions.
    name: A name for the operation (optional).

  Returns:
    A clipped `Tensor`.
  """
  with ops.name_scope(name, "clip_by_norm", [t, clip_norm]) as name:
    t = ops.convert_to_tensor(t, name="t")

    # Calculate L2-norm, clip elements by ratio of clip_norm to L2-norm
    l2norm_inv = math_ops.rsqrt(
        math_ops.reduce_sum(t * t, axes, keep_dims=True))
    tclip = array_ops.identity(t * clip_norm * math_ops.minimum(
        l2norm_inv, constant_op.constant(1.0, dtype=t.dtype) / clip_norm),
        name=name)

  return tclip
```
The docstring makes the function's purpose clear: it caps the L2-norm of the incoming gradient tensor `t` at `clip_norm`. If the L2-norm of `t` exceeds `clip_norm`, `t` is transformed into `t * clip_norm / l2norm(t)`, so that the L2-norm of the transformed `t` is at most `clip_norm`. Note the branch-free trick in the implementation: it multiplies `t` by `clip_norm * minimum(1 / l2norm(t), 1 / clip_norm)`, which equals `1` when `l2norm(t) <= clip_norm` (so `t` passes through unchanged) and `clip_norm / l2norm(t)` otherwise.
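
To build intuition, the same computation is easy to replicate in NumPy. Below is a minimal sketch of the branch-free scaling described above (an illustration written for this post, not TensorFlow's code; the name `clip_by_norm_np` is made up here):

```python
import numpy as np

def clip_by_norm_np(t, clip_norm, axes=None):
    # Scale factor is clip_norm * min(1/||t||, 1/clip_norm):
    # 1 when ||t|| <= clip_norm, clip_norm/||t|| otherwise.
    # (Assumes ||t|| > 0 along the reduced axes.)
    l2norm_inv = 1.0 / np.sqrt(np.sum(t * t, axis=axes, keepdims=True))
    return t * clip_norm * np.minimum(l2norm_inv, 1.0 / clip_norm)
```

For example, `clip_by_norm_np(np.array([3.0, 4.0]), 1.0)` returns `array([0.6, 0.8])`, whose L2-norm is exactly 1, while an input whose norm is already below `clip_norm` comes back unchanged.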
A worked example
Let's use a short piece of code to show the function's effect concretely.

Generate random numbers
```python
import numpy as np

t = np.random.randint(low=0, high=5, size=10)
t
# array([1, 1, 3, 4, 2, 2, 1, 4, 2, 3])
```
Compute the L2-norm (for the array above, √65 ≈ 8.0623)
```python
l2norm4t = np.linalg.norm(t)
l2norm4t
# 8.0622577482985491
```
Clip (rescale) the random numbers
```python
clip_norm = 5
transformed_t = t * clip_norm / l2norm4t
transformed_t
# array([0.62017367, 0.62017367, 1.86052102, 2.48069469, 1.24034735,
#        1.24034735, 0.62017367, 2.48069469, 1.24034735, 1.86052102])
```
Verify
```python
np.linalg.norm(transformed_t)
# 5.0
```
As you can see, the L2-norm of the random sequence has indeed been clipped down to the value of clip_norm.
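
For completeness, we can check that TensorFlow's own op produces the same numbers. A minimal sketch, assuming the TF 1.x session API this post was written against:

```python
import numpy as np
import tensorflow as tf

t = np.array([1, 1, 3, 4, 2, 2, 1, 4, 2, 3], dtype=np.float32)
clipped = tf.clip_by_norm(tf.constant(t), 5.0)

with tf.Session() as sess:
    result = sess.run(clipped)

print(result)                  # matches transformed_t above
print(np.linalg.norm(result))  # ~5.0
```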