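Besides deep learning algorithms (the various *NN models), TensorFlow can of course also be used for the simplest linear regression. After all, the models we will meet later, such as Logistic Regression and neural networks, all build on basic linear regression. This post uses simple linear regression to show the general workflow of building a model with TensorFlow.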

1. Getting and preprocessing the dataset

In the following data pairs
X = fires per 1000 housing units
Y = thefts per 1000 population
within the same Zip code in the Chicago metro area
Reference: U.S. Commission on Civil Rights

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
import xlrd

# local data file directory
DATA_FILE = "data/fire_theft.xls"

# STEP 0: read in data from xls file
book = xlrd.open_workbook(DATA_FILE, encoding_override="utf-8")
sheet = book.sheet_by_index(0)
lst = [sheet.row_values(i) for i in range(1, sheet.nrows)]
data = np.asarray(lst)
n_samples = sheet.nrows - 1


2. Visualizing a sample of the dataset and choosing a model

We can draw a simple two-dimensional scatter plot of this dataset:

# STEP 1: plot the data
X_axis, Y_axis = data.T[0], data.T[1]
plt.plot(X_axis, Y_axis, 'bo', label='Real data')
plt.ylim([0, 150])
plt.xlim([0, 45])
plt.legend()
plt.show()


For this dataset we choose a linear model:

Y = weight * X + bias
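To make the model concrete, here is a minimal plain-Python sketch of the same formula, without TensorFlow. The function name `predict` and the parameter values are assumptions made up for illustration; they are not fitted values from the fire/theft data.

```python
def predict(x, weight, bias):
    """Linear model: Y = weight * X + bias."""
    return weight * x + bias


# Hypothetical parameter values, chosen only to illustrate the formula.
example_weight = 2.0
example_bias = 10.0

# Predicted Y for X = 5.0 under these made-up parameters.
print(predict(5.0, example_weight, example_bias))
```

Training, in the later steps, is nothing more than searching for the `weight` and `bias` that make these predictions match the observed Y values as closely as possible.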


3. Defining placeholders

# STEP 2: create placeholders for input X (number of fires) and label Y (number of thefts)
X = tf.placeholder(tf.float32, name="X")
Y = tf.placeholder(tf.float32, name="Y")


4. Defining Variables

Name    placeholder    Variable
Type    function       class

There is a discussion on Stack Overflow about the difference between the two that is worth reading.

# STEP 3: create Variables(weight and bias here), initialize to 0.
w = tf.Variable(0.0, name="weight")
b = tf.Variable(0.0, name="bias")


5. Building the model

# STEP 4: construct model to predict Y
Y_Predict = X * w + b


6. Defining the loss function

# STEP 5: define loss function (use squared error here)
# huber_loss = tf.losses.huber_loss(Y, Y_Predict) for Huber loss.
loss = tf.square(Y - Y_Predict, name="loss")
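The two losses mentioned above behave differently on large residuals. The following is a rough plain-Python sketch for a single sample, not TensorFlow's implementation. The function names are my own, and `delta=1.0` is assumed to match the default of `tf.losses.huber_loss`.

```python
def squared_loss(y, y_pred):
    """Squared error for one sample."""
    residual = y - y_pred
    return residual ** 2


def huber_loss(y, y_pred, delta=1.0):
    """Huber loss: quadratic for small residuals, linear for large ones."""
    residual = abs(y - y_pred)
    if residual <= delta:
        return 0.5 * residual ** 2
    return delta * (residual - 0.5 * delta)


# Compare both losses on a small and a large residual.
for residual in (0.5, 10.0):
    print(residual, squared_loss(residual, 0.0), huber_loss(residual, 0.0))
```

Because the Huber loss grows only linearly for large residuals, a few outlying Zip codes pull the fitted line less than they would under squared error.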


7. Defining the optimizer

# STEP 6: define optimizer (here we use Gradient Descent with learning rate of 0.001)
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.001).minimize(loss)
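To see what one gradient descent step does to `w` and `b`, here is a hand-written sketch for the squared-error loss on a single sample. It is an illustration in plain Python, not how TensorFlow computes gradients internally. The function name `sgd_step` and the sample values are assumptions.

```python
def sgd_step(w, b, x, y, learning_rate=0.001):
    """One gradient descent update for loss = (y - (w * x + b)) ** 2."""
    error = (w * x + b) - y      # prediction minus label
    grad_w = 2.0 * error * x     # d(loss)/dw
    grad_b = 2.0 * error         # d(loss)/db
    return w - learning_rate * grad_w, b - learning_rate * grad_b


# Start from w = b = 0 and take a single step on one made-up sample.
new_w, new_b = sgd_step(0.0, 0.0, x=2.0, y=10.0)
print(new_w, new_b)
```

Both parameters move in the direction that reduces the error on that sample, and the learning rate controls how far they move.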


8. Initialization and model training

# define the number of epochs
epoch = 200
init = tf.global_variables_initializer()
# execute the model
with tf.Session() as sess:
    # STEP 7: initialize the necessary variables, in this case w and b.
    sess.run(init)

    # STEP 8: train the model
    for i in range(epoch):
        for x, y in data:
            sess.run(optimizer, feed_dict={X: x, Y: y})

    # STEP 9: output the values of w and b.
    w_value, b_value = sess.run([w, b])
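The loop structure above (epochs on the outside, one update per sample on the inside) can be sketched without TensorFlow. This is a simplified stand-in, assuming hand-derived gradients for the squared error. The synthetic samples, the learning rate of 0.01 and the epoch count are made up for the demo and are not the values used in this post.

```python
def train(samples, learning_rate=0.01, epochs=1000):
    """Per-sample gradient descent on squared error for y = w * x + b."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in samples:
            error = (w * x + b) - y
            w -= learning_rate * 2.0 * error * x
            b -= learning_rate * 2.0 * error
    return w, b


# Synthetic, noise-free points lying exactly on y = 3 * x + 2 (hypothetical data).
samples = [(0.0, 2.0), (1.0, 5.0), (2.0, 8.0), (3.0, 11.0), (4.0, 14.0)]
w_fit, b_fit = train(samples)
print(w_fit, b_fit)  # should end up close to 3 and 2
```

On real, noisy data such as the fire/theft pairs, the parameters will not settle on an exact line, and the result depends on the learning rate and the number of epochs.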


9. Evaluation and prediction

# plot the predicted line.
plt.plot(X_axis, X_axis * w_value + b_value, 'r', label='Predicted data')
plt.ylim([0, 150])
plt.xlim([0, 45])
plt.legend()
plt.show()
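Besides looking at the plot, the fit can be checked numerically. The sketch below computes the mean squared error of a fitted line and, as a reference point, the closed-form least-squares solution for one input variable. The closed form is a swapped-in technique for comparison, not part of the TensorFlow workflow above, and the function names and data points are hypothetical.

```python
def mean_squared_error(xs, ys, w, b):
    """Average squared difference between labels and predictions."""
    return sum((y - (w * x + b)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def least_squares_fit(xs, ys):
    """Closed-form slope and intercept for simple linear regression."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x


# Made-up points lying on y = 2 * x + 1, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = least_squares_fit(xs, ys)
print(slope, intercept, mean_squared_error(xs, ys, slope, intercept))

# Prediction for a new, unseen x using the fitted parameters.
print(slope * 5.0 + intercept)
```

With the real data, one could compare `w_value` and `b_value` from training against this closed-form result. A large gap would suggest that gradient descent has not converged yet.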