2 years ago

## 线性模型和梯度下降

### 一元线性回归

$$\hat{y}_i$$是我们预测的结果，希望通过$$\hat{y}_i$$来拟合目标$$y_i$$， 通俗来讲就是找到这个函数拟合$$y_i$$使得误差最小，即最小化$$\frac{1}{n}\sum_{i=1}^n(\hat{y}_i - y_i)^2$$

### 梯度下降法

#### 梯度下降法

In [1]:
import torch
import numpy as np
torch.manual_seed(2018)
Out[1]:
<torch._C.Generator at 0x7f603c0619b0>
In [9]:
# 读入数据
x_train = np.array([[3.3], [4.4], [5.5], [6.71], [6.93], [4.168],
[9.779], [6.182], [7.59], [2.167], [7.042],
[10.791], [5.313], [7.997], [3.1]], dtype=np.float32)

y_train = np.array([[1.7], [2.76], [2.09], [3.19], [1.694], [1.573],
[3.366], [2.596], [2.53], [1.221], [2.827],
[3.465], [1.65], [2.904], [1.3]], dtype=np.float32)
In [10]:
# 画出图像
import matplotlib.pyplot as plt
%matplotlib inline
plt.plot(x_train, y_train, 'bo')
Out[10]:
[<matplotlib.lines.Line2D at 0x7f5fe483fd30>]
In [11]:
# 转换成Tensor
x_train = torch.from_numpy(x_train)
y_train = torch.from_numpy(y_train)

# 定义参数w和b
# 随机初始化
# 初始化0
b = Variable(torch.zeros(1), requires_grad=True)
In [12]:
# 构建线性回归模型
x_train = Variable(x_train)
y_train = Variable(y_train)

def linear_model(X):
return X * w + b
In [13]:
y_ = linear_model(x_train)

In [15]:
plt.plot(x_train.data.numpy(), y_train.data.numpy(), 'bo', label='real')
plt.plot(x_train.data.numpy(), y_.data.numpy(), 'ro', label='estimated')
plt.legend()

Out[15]:
<matplotlib.legend.Legend at 0x7f5fe47c58d0>

In [16]:
# 计算误差
def get_loss(y_, y):
loss = get_loss(y_, y_train)

In [17]:
print(loss)
Variable containing: 0.6740 [torch.FloatTensor of size 1] 

In [18]:
# 自动求导
loss.backward()

In [19]:
# 查看w, b梯度

Variable containing: 8.5503 [torch.FloatTensor of size 1] Variable containing: 1.0291 [torch.FloatTensor of size 1] 
In [20]:
# 更新一次参数
w.data = w.data - 1e-2 * w.grad.data
b.data = b.data - 1e-2 * b.grad.data
In [21]:
# 更新完参数后，我们再一次看看模型输出的结果
y_ = linear_model(x_train)
plt.plot(x_train.data.numpy(),y_train.data.numpy(), 'bo', label='real')
plt.plot(x_train.data.numpy(),y_.data.numpy(), 'ro', label='estimated')
plt.legend()

Out[21]:
<matplotlib.legend.Legend at 0x7f5fe4831748>

In [22]:
for epoch in range(10):
y_ = linear_model(x_train)
loss = get_loss(y_, y_train)
# 归零梯度
loss.backward()

print('epoch: {}, loss: {}'.format(epoch, loss.data[0]))
epoch: 0, loss: 0.2525334060192108 epoch: 1, loss: 0.244352787733078 epoch: 2, loss: 0.2438223510980606 epoch: 3, loss: 0.24343542754650116 epoch: 4, loss: 0.2430531084537506 epoch: 5, loss: 0.24267280101776123 epoch: 6, loss: 0.24229441583156586 epoch: 7, loss: 0.24191799759864807 epoch: 8, loss: 0.2415434867143631 epoch: 9, loss: 0.2411709427833557 
In [23]:
y_ = linear_model(x_train)
plt.plot(x_train.data.numpy(), y_train.data.numpy(), 'bo', label='real')
plt.plot(x_train.data.numpy(), y_.data.numpy(), 'ro', label='estimated')
plt.legend()

Out[23]:
<matplotlib.legend.Legend at 0x7f5fe29c2710>

## 多项式回归模型

In [24]:
# 定义一个多变量函数
w_target = np.array([0.5, 3, 2.4]) # 定义参数
b_target = np.array([0.9]) # 定义参数

f_des = 'y = {:.2f} + {:.2f} * x + {:.2f} * x^2 + {:.2f} * x^3'.format(b_target[0], w_target[0], w_target[1], w_target[2])
print(f_des)

y = 0.90 + 0.50 * x + 3.00 * x^2 + 2.40 * x^3 

In [25]:
# 画出这个函数的曲线
x_samples = np.arange(-3, 3.1, 0.1)
y_samples = b_target[0] + w_target[0] * x_samples + w_target[1] * x_samples ** 2 + w_target[2] * x_samples ** 3
plt.plot(x_samples, y_samples, label='real curve')
plt.legend()

Out[25]:
<matplotlib.legend.Legend at 0x7f5fe27dfd68>

In [27]:
# 构建数据x 和 y
# x是一个如下矩阵 [x, x^2, x^3]
# y 是函数的结果[y]

x_train = np.stack([x_samples ** i for i in range(1, 4)], axis=1)
x_train = torch.from_numpy(x_train).float() # 转换成float tensor
y_train = torch.from_numpy(y_samples).float().unsqueeze(1) # 转换成float tensor

In [28]:
# 定义参数和模型

# 将x和y转换成Variable
x_train = Variable(x_train)
y_train = Variable(y_train)

def multi_linear(x):
return torch.mm(x, w) + b
In [29]:
# 画出更新之前的模型
y_pred = multi_linear(x_train)

plt.plot(x_train.data.numpy()[:, 0], y_pred.data.numpy(), label='fitting curve', color='r')
plt.plot(x_train.data.numpy()[:, 0], y_samples, label='real curve', color='b')
plt.legend()

Out[29]:
<matplotlib.legend.Legend at 0x7f5fe276d198>
In [30]:
# 这两条曲线的误差和一元的线性模型的误差是相同的，前面已经定义过get_loss
loss = get_loss(y_pred, y_train)
print(loss)

Variable containing: 1008.7871 [torch.FloatTensor of size 1] 
In [31]:
# 自动求导
loss.backward()

In [32]:
# 查看一下w 和 b的梯度

Variable containing: -92.1842 -123.2529 -601.3099 [torch.FloatTensor of size 3x1] Variable containing: -22.8959 [torch.FloatTensor of size 1] 
In [33]:
# 更新一下参数
b.data = b.data - 0.001*b.grad.data
In [34]:
# 画出更新后的模型
y_pred = multi_linear(x_train)
plt.plot(x_train.data.numpy()[:, 0], y_pred.data.numpy(), label='fitting curve', color='r')
plt.plot(x_train.data.numpy()[:, 0], y_samples, label='real curve', color='b')
plt.legend()

Out[34]:
<matplotlib.legend.Legend at 0x7f5fe2719c88>

In [35]:
# 100 次迭代
for epoch in range(100):
y_pred = multi_linear(x_train)
loss = get_loss(y_pred, y_train)

loss.backward()

if (epoch + 1) % 20 == 0:
print('epoch :{} , loss : {}'.format(epoch+1, loss.data[0]))
epoch :20 , loss : 52.10027313232422 epoch :40 , loss : 12.700457572937012 epoch :60 , loss : 3.4888031482696533 epoch :80 , loss : 1.3191393613815308 epoch :100 , loss : 0.7936816811561584 
In [36]:
# 画出更新后的结果
y_pred = multi_linear(x_train)
plt.plot(x_train.data.numpy()[:, 0], y_pred.data.numpy(), label='fitting curve', color='r')
plt.plot(x_train.data.numpy()[:, 0], y_samples, label='real curve', color='b')
plt.legend()

Out[36]:
<matplotlib.legend.Legend at 0x7f5fe484c438>