TensorFlow's variables are a frequent source of confusion. This post analyzes them in detail to clear things up.

1 The three main functions

  1. tf.variable_scope(<scope_name>, <initializer>): declares a namespace so that variables in different scopes cannot collide. Example:
with tf.variable_scope("foo"):
  with tf.variable_scope("bar"):
    v = tf.get_variable("v", [1])
    assert v.name == "foo/bar/v:0"
    # True
  with tf.variable_scope("bbb"):
    v1 = tf.Variable([1], name="v")
    assert v1.name == "foo/bbb/v:0"
    # True
  2. tf.get_variable(<name>, <shape>, <dtype>, <initializer>): creates or retrieves a variable by name.
  • By default a scope has reuse=False, so the call creates and returns a new variable. Example:
with tf.variable_scope("foo"):
  v = tf.get_variable("v", [1])
  • If the enclosing scope has reuse (sharing) enabled, i.e. tf.variable_scope(reuse=True), the call looks up an existing variable with the same name and returns it. Example:
with tf.variable_scope("foo"):
  v = tf.get_variable("v", [1])
with tf.variable_scope("foo", reuse=True):
  v1 = tf.get_variable("v", [1])
  assert v1 == v
  # True
  • If no existing variable matches the name, it raises an error rather than creating one. Example:
with tf.variable_scope("foo", reuse=True):
  v = tf.get_variable("v", [1])
  #  Raises ValueError("... v does not exist ...").
  3. tf.Variable(<initial-value>, name=<optional-name>): creates a variable from an initial value. Note: the name= keyword must not be omitted. Example:
with tf.variable_scope("bbb"):
  v = tf.Variable([1], "v")
  assert v.name == "bbb/v:0"
  # AssertionError: v.name == "bbb/Variable:0". The positional "v" was bound
  # to the trainable parameter, not to name; write tf.Variable([1], name="v").
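The scope-qualified names above can be thought of as path prefixes plus an output index. A minimal pure-Python sketch of the naming scheme (the helper scoped_name is hypothetical, not a TensorFlow API):

```python
# Hypothetical sketch of how variable_scope composes names: each nested
# scope contributes one path segment, and ":0" marks the op's first output.
def scoped_name(scopes, name, output_index=0):
    return "/".join(list(scopes) + [name]) + ":%d" % output_index

print(scoped_name(["foo", "bar"], "v"))   # foo/bar/v:0
```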

2 The key difference

  • Variables created with tf.get_variable() must have distinct name values;
with tf.variable_scope("foo"):
  v = tf.get_variable("v", [1])
  v1 = tf.get_variable("v", [1])
  #  Raises ValueError("... v already exists ...").
  • tf.Variable() allows duplicate name values (TensorFlow's implementation automatically uniquifies them with an aliasing mechanism).
with tf.variable_scope("foo"):
  v = tf.Variable([1], name="v")
  v1 = tf.Variable([1], name="v")
  # v.name == "foo/v:0"
  # v1.name == "foo/v_1:0"
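The aliasing mechanism can be sketched in pure Python: get_variable-style lookup refuses duplicate names, while Variable-style creation appends _1, _2, ... to make them unique. The VariableNames registry below is a hypothetical illustration, not TensorFlow code:

```python
# Hypothetical sketch of TensorFlow's name handling (not the real implementation).
class VariableNames:
    def __init__(self):
        self.used = set()

    def get_variable(self, name):
        # tf.get_variable-style: a duplicate name is an error.
        if name in self.used:
            raise ValueError("%s already exists" % name)
        self.used.add(name)
        return name

    def variable(self, name):
        # tf.Variable-style: duplicates are silently uniquified (v, v_1, v_2, ...).
        candidate, i = name, 0
        while candidate in self.used:
            i += 1
            candidate = "%s_%d" % (name, i)
        self.used.add(candidate)
        return candidate

names = VariableNames()
print(names.variable("v"))   # v
print(names.variable("v"))   # v_1
```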

3 Sharing

Variables created with tf.get_variable() can be shared. Example:

def conv_relu(input, kernel_shape, bias_shape):
  # Create variable named "weights".
  weights = tf.get_variable("weights", kernel_shape, initializer=tf.random_normal_initializer())
  # Create variable named "biases".
  biases = tf.get_variable("biases", bias_shape, initializer=tf.constant_initializer(0.0))
  conv = tf.nn.conv2d(input, weights, strides=[1, 1, 1, 1], padding='SAME')
  return tf.nn.relu(conv + biases)

def my_image_filter(input_images):
  with tf.variable_scope("conv1"):
    # Variables created here will be named "conv1/weights", "conv1/biases".
    relu1 = conv_relu(input_images, [5, 5, 32, 32], [32])
  with tf.variable_scope("conv2"):
    # Variables created here will be named "conv2/weights", "conv2/biases".
    return conv_relu(relu1, [5, 5, 32, 32], [32])

But if you call it directly twice, it raises an error:

result1 = my_image_filter(image1)
result2 = my_image_filter(image2)
# Raises ValueError(... conv1/weights already exists ...)

You must explicitly call scope.reuse_variables(), or open the scope with tf.variable_scope("image_filters", reuse=True):

with tf.variable_scope("image_filters") as scope:
  result1 = my_image_filter(image1)
  scope.reuse_variables()
  # Alternative: tf.get_variable_scope().reuse_variables()
  result2 = my_image_filter(image2)
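Conceptually, get_variable works against a per-graph store keyed by full variable name, gated by the scope's reuse flag: reuse=False must create a fresh entry, reuse=True must find an existing one. A hypothetical pure-Python sketch (VariableStore is not a public TensorFlow API):

```python
# Hypothetical sketch of the create-vs-fetch logic behind tf.get_variable.
class VariableStore:
    def __init__(self):
        self.vars = {}

    def get_variable(self, full_name, make_value, reuse):
        if reuse:
            # Sharing mode: the variable must already exist.
            if full_name not in self.vars:
                raise ValueError("Variable %s does not exist" % full_name)
            return self.vars[full_name]
        # Creation mode: the name must be fresh.
        if full_name in self.vars:
            raise ValueError("Variable %s already exists" % full_name)
        self.vars[full_name] = make_value()
        return self.vars[full_name]

store = VariableStore()
w = store.get_variable("conv1/weights", lambda: [0.0], reuse=False)
w_again = store.get_variable("conv1/weights", lambda: [1.0], reuse=True)
assert w is w_again  # the same object is shared, not re-created
```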

4 Initialization

tf.get_variable(<name>, <shape>, <dtype>, <initializer>) accepts an initializer; common ones include:

  • tf.constant_initializer(value): initializes to a constant;
  • tf.random_uniform_initializer(a, b): initializes uniformly from a to b;
  • tf.random_normal_initializer(mean, stddev): initializes from a normal (Gaussian) distribution with the given mean and standard deviation;
  • tf.truncated_normal_initializer(mean, stddev, seed, dtype): truncated normal distribution. Values are drawn from a normal distribution with the given mean and standard deviation, and any value more than 2 standard deviations from the mean is discarded and redrawn. This is the recommended initializer for neural-network weights and filters.
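The redraw-beyond-2-standard-deviations rule can be sketched with Python's random module (a hypothetical stand-in for tf.truncated_normal_initializer, not its actual implementation):

```python
import random

def truncated_normal(mean, stddev, n, seed=None):
    # Redraw any sample that falls more than 2 standard deviations
    # from the mean, mirroring what the description above specifies.
    rng = random.Random(seed)
    out = []
    while len(out) < n:
        x = rng.gauss(mean, stddev)
        if abs(x - mean) <= 2 * stddev:
            out.append(x)
    return out

samples = truncated_normal(0.0, 1.0, 1000, seed=42)
assert all(abs(x) <= 2.0 for x in samples)
```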

You can also pass an initializer to tf.variable_scope(<scope_name>, <initializer>); any later tf.get_variable() call in that scope that does not specify its own initializer falls back to the scope's initializer by default.
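The fall-back behaviour is simple: use the variable's own initializer if one was given, otherwise the enclosing scope's. A hypothetical sketch (pick_initializer and both initializers below are illustrative names, not TensorFlow internals):

```python
# Hypothetical sketch: tf.get_variable falls back to the scope's initializer.
def pick_initializer(var_initializer, scope_initializer):
    return var_initializer if var_initializer is not None else scope_initializer

scope_init = lambda shape: [0.0] * shape   # default set on the variable_scope
own_init = lambda shape: [1.0] * shape     # explicitly passed to get_variable

assert pick_initializer(None, scope_init) is scope_init
assert pick_initializer(own_init, scope_init) is own_init
```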
