
7. Polynomial fitting as an example: how model complexity and training-set size affect underfitting and overfitting

We generate the sample labels with the following third-order polynomial function, where the noise term ε follows a normal distribution with mean 0 and standard deviation 0.1:

$$
y = 1.2x - 3.4x^2 + 5.6x^3 + 5 + \varepsilon
$$

import d2lzh as d2l
from mxnet import autograd, gluon, nd
from mxnet.gluon import data as gdata, loss as gloss, nn

# Generate the dataset: 100 training samples and 100 test samples
n_train, n_test, true_w, true_b = 100, 100, [1.2, -3.4, 5.6], 5
features = nd.random.normal(shape=(n_train + n_test, 1))
# Build the polynomial features [x, x^2, x^3]
poly_features = nd.concat(features, nd.power(features, 2), nd.power(features, 3))
labels = (true_w[0] * poly_features[:, 0] + true_w[1] * poly_features[:, 1]
          + true_w[2] * poly_features[:, 2] + true_b)
# Add Gaussian noise with mean 0 and standard deviation 0.1
labels += nd.random.normal(scale=0.1, shape=labels.shape)
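
As a quick sanity check (an illustrative addition, not in the original listing), we can print the first two samples to confirm the shapes of the raw feature, the polynomial features, and the noisy labels:

# Inspect the first two generated samples
print(features[:2])       # raw inputs x, shape (2, 1)
print(poly_features[:2])  # [x, x^2, x^3], shape (2, 3)
print(labels[:2])         # noisy labels, shape (2,)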


# Plotting helper: semilogy draws loss curves with a log-scale y-axis
def semilogy(x_vals, y_vals, x_label, y_label, x2_vals=None, y2_vals=None,
             legend=None, figsize=(3.5, 2.5)):
    d2l.set_figsize(figsize)
    d2l.plt.xlabel(x_label)
    d2l.plt.ylabel(y_label)
    d2l.plt.semilogy(x_vals, y_vals)
    if x2_vals and y2_vals:
        d2l.plt.semilogy(x2_vals, y2_vals, linestyle=':')
        d2l.plt.legend(legend)
    # Show the figure whether or not a second curve was supplied
    d2l.plt.show()
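
A standalone call of this helper (with made-up numbers, purely to show the signature) might look like:

# Hypothetical usage: one decaying curve on a log-scale y-axis
semilogy(range(1, 11), [2 ** -i for i in range(10)], 'epochs', 'loss')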


# Training setup: number of epochs and the squared loss
num_epochs, loss = 100, gloss.L2Loss()


def fit_and_plt(train_features, test_features, train_labels, test_labels):
    # A single linear layer; the model's complexity is set by the input features
    net = nn.Sequential()
    net.add(nn.Dense(1))
    net.initialize()
    batch_size = min(10, train_labels.shape[0])
    train_iter = gdata.DataLoader(gdata.ArrayDataset(train_features, train_labels),
                                  batch_size, shuffle=True)
    trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 0.01})
    train_ls, test_ls = [], []
    for _ in range(num_epochs):
        for X, y in train_iter:
            with autograd.record():
                l = loss(net(X), y)
            l.backward()
            # step(batch_size) rescales the summed gradient by 1/batch_size
            trainer.step(batch_size)
        # Record the mean loss on the full training and test sets after each epoch
        train_ls.append(loss(net(train_features), train_labels).mean().asscalar())
        test_ls.append(loss(net(test_features), test_labels).mean().asscalar())

    print('final epoch:train loss', train_ls[-1], 'test loss', test_ls[-1])
    semilogy(range(1, num_epochs + 1), train_ls, 'epochs', 'loss',
             range(1, num_epochs + 1), test_ls, ['train', 'test'])
    print('weight:', net[0].weight.data().asnumpy(),
          '\nbias:', net[0].bias.data().asnumpy())

1. Third-order polynomial fitting (normal):

We first fit with a third-order polynomial of the same order as the data-generating function. The experiment shows that this model's training error and its error on the test set are both low. The learned parameters are also close to the true values:

# Normal fit: training error and test error are both small
fit_and_plt(poly_features[:n_train, :], poly_features[n_train:, :], labels[:n_train], labels[n_train:])

The output is:

final epoch:train loss 0.00795475 test loss 0.010587299
weight: [[ 1.0719422 -3.32675    5.6385565]] 
bias: [4.8979206]

2. Linear fitting (underfitting):

Next we try fitting with a linear function. Clearly, after dropping in the early epochs, this model's training error becomes hard to reduce further; even after the final epoch it remains high. A linear model easily underfits a dataset generated by a nonlinear model such as a third-order polynomial.

# Underfitting case
fit_and_plt(features[:n_train, :], features[n_train:, :], labels[:n_train], labels[n_train:])

The output is:

final epoch:train loss 148.14888 test loss 74.99101
weight: [[18.996931]] 
bias: [-0.06482039]
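
As an extra illustration of model complexity (an added experiment, not from the original post), we can also fit an intermediate second-order model by keeping only the [x, x^2] columns of poly_features; it should do better than the linear fit but still miss the cubic term:

# Hypothetical intermediate case: second-order features [x, x^2] only
fit_and_plt(poly_features[:n_train, 0:2], poly_features[n_train:, 0:2],
            labels[:n_train], labels[n_train:])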

3. Insufficient training samples (overfitting):

In fact, even a third-order polynomial model of the same order as the data-generating model can easily overfit when training samples are scarce. Let us train the model on only five samples, barely more than the model's four parameters (three weights and a bias). With so little data, the model is effectively too complex and is easily swayed by the noise in the training samples. During training the training error stays low, but the error on the test set is high: a typical case of overfitting.

# Overfitting case
fit_and_plt(poly_features[0:5, :], poly_features[n_train:, :], labels[0:5], labels[n_train:])

The output is:

final epoch:train loss 1.6807058 test loss 208.46396
weight: [[0.04566861 0.3971918  0.09108218]] 
bias: [2.4448707]
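
To see the effect of training-set size directly (again an added sketch, not from the original post), we can rerun the same third-order model with progressively more training samples; the gap between training loss and test loss should shrink as the sample count grows:

# Hypothetical sweep over training-set sizes for the third-order model
for n in (5, 20, 100):
    fit_and_plt(poly_features[:n, :], poly_features[n_train:, :],
                labels[:n], labels[n_train:])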

Reference:
《动手学深度学习》 (Dive into Deep Learning)
