Machine Learning (9)

Running TensorFlow

Posted by Gavin on June 6, 2019

Drunk, unaware that the sky lies in the water,

a boat full of clear dreams presses on a river of stars.

Preface

TensorFlow is a powerful open-source library for numerical computation, particularly well suited to large-scale machine learning projects. The principle behind it is simple: first define a graph of computations to perform, then execute it.


Basic Operations

Installation

pip3 install --upgrade tensorflow

Check that the installation succeeded:

python3 -c 'import tensorflow; print(tensorflow.__version__)'

Creating a graph and running it in a session

import tensorflow as tf

x = tf.Variable(3, name='x')
y = tf.Variable(4, name='y')
f = x*x*y + y + 2  # this only builds the graph; nothing is computed yet

sess = tf.Session()
sess.run(x.initializer)
sess.run(y.initializer)
result = sess.run(f)  # actually evaluates the graph
print(result)  # 42
sess.close()

A session has to be opened every time you want to execute the graph, and repeating sess.run() for each operation is a bit clumsy, so instead:

with tf.Session() as sess:
	x.initializer.run()
	y.initializer.run()
	result = f.eval()

Inside the with block there is a default session, which is what x.initializer.run() and f.eval() use. Instead of manually running the initializer for every single variable, you can use global_variables_initializer(). It creates a node in the graph that initializes all variables when it is run in a session:

init = tf.global_variables_initializer()
with tf.Session() as sess:
	init.run()
	result = f.eval()

You can also use an InteractiveSession. It differs from a regular Session in that, when created, it sets itself as the default session, so a with block is not needed, but you do have to close the session manually:

sess = tf.InteractiveSession()
init.run()
result = f.eval()
sess.close()

Managing Graphs

Every node we create is automatically added to the default graph. Most of the time this is fine, but sometimes you want to manage several independent graphs; you can create a new Graph and temporarily make it the default inside a with block:

x1 = tf.Variable(1)
print(x1.graph is tf.get_default_graph())  # True

graph = tf.Graph()
with graph.as_default():
	x2 = tf.Variable(2)

print(x2.graph is graph)                   # True
print(x2.graph is tf.get_default_graph())  # False
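
In an interactive environment such as Jupyter, re-running cells keeps adding duplicate nodes to the default graph; a common fix is to reset it:

tf.reset_default_graph()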

Lifecycle of Node Values

When you evaluate a node, TensorFlow automatically determines the nodes it depends on and evaluates those first. However, all node values are dropped between graph runs and recomputed on the next run, whereas variable values persist from initialization until the session is closed. If you do not want the shared nodes to be evaluated repeatedly, you have to ask for all the results in a single graph run:

w = tf.constant(3)
x = w + 2
y = x + 5
z = x * 3
with tf.Session() as sess:
	y_val, z_val = sess.run([y, z])  # w and x are evaluated only once, for both y and z
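
If instead you evaluate y and z separately, TensorFlow runs the graph twice, so w and x are recomputed each time:

with tf.Session() as sess:
	print(y.eval())  # 10 (evaluates w and x, then y)
	print(z.eval())  # 15 (evaluates w and x again, then z)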

Linear Regression with TensorFlow

TensorFlow operations can take any number of inputs and produce any number of outputs. For example, addition and multiplication each take two inputs and produce one output. Constants and variables (source ops) take no input. Inputs and outputs are all tensors. Using the California housing dataset as an example, here is linear regression solved with the Normal Equation:

import numpy as np
from sklearn.datasets import fetch_california_housing

housing = fetch_california_housing()
m, n = housing.data.shape  # m instances, n features
housing_data_plus_bias = np.c_[np.ones((m, 1)), housing.data]  # add the bias input x0 = 1
X = tf.constant(housing_data_plus_bias, dtype=tf.float32, name='X')
y = tf.constant(housing.target.reshape(-1, 1), dtype=tf.float32, name='y')
XT = tf.transpose(X)
theta = tf.matmul(tf.matmul(tf.matrix_inverse(tf.matmul(XT, X)), XT), y)  # Normal Equation
with tf.Session() as sess:
	theta_value = theta.eval()

Compared with computing the Normal Equation directly with NumPy, the code above will automatically run the computation on your GPU if you have one.
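
For comparison, here is a minimal NumPy version of the same Normal Equation, assuming housing_data_plus_bias and housing.target from the snippet above:

X_np = housing_data_plus_bias
y_np = housing.target.reshape(-1, 1)
theta_numpy = np.linalg.inv(X_np.T.dot(X_np)).dot(X_np.T).dot(y_np)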


Gradient Descent

Manually Computing the Gradients
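
For reference, the batch gradient of the MSE with respect to θ is (2/m) · Xᵀ(Xθ − y); that is exactly what the gradients line in the code below computes.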

n_epochs = 1000
learning_rate = 0.01
X = tf.constant(housing_data_plus_bias, dtype=tf.float32, name='X')
y = tf.constant(housing.target.reshape(-1,1), dtype=tf.float32, name='y')
theta = tf.Variable(tf.random_uniform([n+1, 1], -1.0, 1.0), name='theta')
y_pred = tf.matmul(X, theta, name='predictions')
error = y_pred - y
mse = tf.reduce_mean(tf.square(error), name='mse')
gradients = 2/m * tf.matmul(tf.transpose(X), error)
training_op = tf.assign(theta, theta - learning_rate * gradients)
init = tf.global_variables_initializer()
with tf.Session() as sess:
	sess.run(init)
	for epoch in range(n_epochs):
		if epoch % 100 == 0:
			print('Epoch ', epoch, 'MSE = ', mse.eval())
		sess.run(training_op)
	best_theta = theta.eval()

Automatic Differentiation

Manually deriving gradients becomes tedious and error-prone for anything more complex than linear regression. Instead, tf.gradients() takes an op (here mse) and a list of variables, and creates the ops that compute the gradients of the op with respect to each variable using reverse-mode autodiff:

gradients = tf.gradients(mse, [theta])[0]

Using an Optimizer

n_epochs = 1000
learning_rate = 0.01
X = tf.constant(housing_data_plus_bias, dtype=tf.float32, name='X')
y = tf.constant(housing.target.reshape(-1,1), dtype=tf.float32, name='y')
theta = tf.Variable(tf.random_uniform([n+1, 1], -1.0, 1.0), name='theta')
y_pred = tf.matmul(X, theta, name='predictions')
error = y_pred - y
mse = tf.reduce_mean(tf.square(error), name='mse')
#gradients = 2/m * tf.matmul(tf.transpose(X), error)
gradients = tf.gradients(mse, [theta])[0]
#training_op = tf.assign(theta, theta - learning_rate * gradients)
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
training_op = optimizer.minimize(mse)
init = tf.global_variables_initializer()
with tf.Session() as sess:
	sess.run(init)
	for epoch in range(n_epochs):
		if epoch % 100 == 0:
			print('Epoch ', epoch, 'MSE = ', mse.eval())
		sess.run(training_op)
	best_theta = theta.eval()
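
The optimizer can be swapped out without changing anything else in the graph; for example, momentum optimization usually converges much faster than plain gradient descent:

optimizer = tf.train.MomentumOptimizer(learning_rate=learning_rate, momentum=0.9)
training_op = optimizer.minimize(mse)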

Feeding Data to the Training Algorithm

Mini-batch Gradient Descent

To replace X and y with a new mini-batch at each iteration, the simplest way is to use placeholder nodes. Placeholder nodes perform no computation themselves; they just output the data we feed them at run time:

A = tf.placeholder(tf.float32, shape=[None, 3])
B = A + 5
with tf.Session() as sess:
	B_val_1 = B.eval(feed_dict={A: [[1,2,3]]})
	B_val_2 = B.eval(feed_dict={A: [[4,5,6],[7,8,9]]})

Template:

# define placeholders
X = tf.placeholder(tf.float32, shape=(None, n + 1), name='X')
y = tf.placeholder(tf.float32, shape=(None, 1), name='y')
# define the batches (m is the number of training instances)
batch_size = 100
n_batches = int(np.ceil(m / batch_size))
# fetch one mini-batch of data
def fetch_batch(epoch, batch_index, batch_size):
	[...]  # load the data from disk
	return X_batch, y_batch
init = tf.global_variables_initializer()
with tf.Session() as sess:
	sess.run(init)
	for epoch in range(n_epochs):
		for batch_index in range(n_batches):
			X_batch, y_batch = fetch_batch(epoch, batch_index, batch_size)
			sess.run(training_op, feed_dict={X: X_batch, y: y_batch})
	best_theta = theta.eval()
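
A minimal sketch of fetch_batch for the California housing data used earlier, assuming housing_data_plus_bias, housing.target and m are still in memory (the per-batch seed just makes runs reproducible):

def fetch_batch(epoch, batch_index, batch_size):
	np.random.seed(epoch * n_batches + batch_index)  # a different, reproducible batch each call
	indices = np.random.randint(m, size=batch_size)  # sample batch_size row indices
	X_batch = housing_data_plus_bias[indices]
	y_batch = housing.target.reshape(-1, 1)[indices]
	return X_batch, y_batch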

Saving and Restoring Models

During training you may want to save checkpoints at regular intervals, so that if the program crashes you can resume from the last checkpoint instead of starting over.

saver = tf.train.Saver()
...
#save model
saver.save(sess, 'path')
#restore model
saver.restore(sess, 'path')
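
A sketch of periodic checkpointing inside the training loop from earlier (the paths are just examples):

saver = tf.train.Saver()
with tf.Session() as sess:
	sess.run(init)
	for epoch in range(n_epochs):
		if epoch % 100 == 0:
			saver.save(sess, '/tmp/my_model.ckpt')  # checkpoint every 100 epochs
		sess.run(training_op)
	best_theta = theta.eval()
	saver.save(sess, '/tmp/my_model_final.ckpt')  # final model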

Name Scopes

When a graph gets large, name scopes group related nodes together; ops created inside the scope get the scope name as a prefix:

with tf.name_scope('loss') as scope:
	error = y_pred - y
	mse = tf.reduce_mean(tf.square(error), name='mse')
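
Ops created inside the scope now have names prefixed with the scope name, which you can verify:

print(error.op.name)  # loss/sub
print(mse.op.name)    # loss/mse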

Modularity

Repeating the same graph-construction code is error-prone and hard to maintain; wrapping it in a function lets you reuse the same fragment, here a ReLU:

def relu(X):
	w_shape = (int(X.get_shape()[1]), 1)
	w = tf.Variable(tf.random_normal(w_shape), name='weights')
	b = tf.Variable(0.0, name='bias')
	z = tf.add(tf.matmul(X, w), b, name='z')
	return tf.maximum(z, 0, name='relu')

n_features = 3
X = tf.placeholder(tf.float32, shape=(None, n_features), name='X')
relus = [relu(X) for i in range(5)]
output = tf.add_n(relus, name='output')
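
TensorFlow keeps node names unique by appending an index when a name is reused, so the weights and biases created by the five relu() calls do not collide; you can check by listing the trainable variables:

for var in tf.trainable_variables():
	print(var.op.name)  # weights, bias, weights_1, bias_1, weights_2, ...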