This is actually an assignment from Andrew Ng's Machine Learning course on Coursera. Almost the entire framework is provided; we only need to complete a few functions.
1. Plotting the raw data
%% ======================= Part 1: Plotting =======================
fprintf('Plotting Data ...\n')
data = load('ex1data1.txt');
X = data(:, 1); y = data(:, 2);
m = length(y); % number of training examples

% Plot Data
% Note: You have to complete the code in plotData.m
plotData(X, y);

fprintf('Program paused. Press enter to continue.\n');
pause;
This step first loads the data with the load function; we are then asked to implement the function that plots the raw data.
function plotData(x, y)
%PLOTDATA Plots the data points x and y into a new figure
%   PLOTDATA(x,y) plots the data points and gives the figure axes labels of
%   population and profit.

figure; % open a new figure window
plot(x, y, 'rx', 'MarkerSize', 10);
ylabel('Profit in $10,000s');            % Set the y-axis label
xlabel('Population of City in 10,000s'); % Set the x-axis label

end
2. Cost function and gradient descent
With the raw data plotted, we reach the core of linear regression: the gradient descent algorithm.
%% =================== Part 2: Cost and Gradient descent ===================
X = [ones(m, 1), data(:,1)]; % Add a column of ones to x
theta = zeros(2, 1);         % initialize fitting parameters

% Some gradient descent settings
iterations = 1500;
alpha = 0.01;

fprintf('\nTesting the cost function ...\n')
% compute and display initial cost
J = computeCost(X, y, theta);
fprintf('With theta = [0 ; 0]\nCost computed = %f\n', J);
fprintf('Expected cost value (approx) 32.07\n');

% further testing of the cost function
J = computeCost(X, y, [-1 ; 2]);
fprintf('\nWith theta = [-1 ; 2]\nCost computed = %f\n', J);
fprintf('Expected cost value (approx) 54.24\n');

fprintf('Program paused. Press enter to continue.\n');
pause;

fprintf('\nRunning Gradient Descent ...\n')
% run gradient descent
theta = gradientDescent(X, y, theta, alpha, iterations);
% print theta to screen
fprintf('Theta found by gradient descent:\n');
fprintf('%f\n', theta);
fprintf('Expected theta values (approx)\n');
fprintf(' -3.6303\n  1.1664\n\n');

% Plot the linear fit
hold on; % keep previous plot visible
plot(X(:,2), X*theta, '-')
legend('Training data', 'Linear regression')
hold off % don't overlay any more plots on this figure

% Predict values for population sizes of 35,000 and 70,000
predict1 = [1, 3.5] * theta;
fprintf('For population = 35,000, we predict a profit of %f\n',...
    predict1*10000);
predict2 = [1, 7] * theta;
fprintf('For population = 70,000, we predict a profit of %f\n',...
    predict2*10000);

fprintf('Program paused. Press enter to continue.\n');
pause;
In this code, the input X is first preprocessed by prepending a column of ones, so that the constant term theta0 is effectively multiplied by this 1; the column of ones thus acts as an extra feature of X.
The second step checks the correctness of the loss function, i.e. the computeCost function.
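For reference, the quantity computeCost evaluates is the standard squared-error cost for linear regression:

J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2, \qquad h_\theta(x^{(i)}) = \theta_0 + \theta_1 x^{(i)}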
function J = computeCost(X, y, theta)
%COMPUTECOST Compute cost for linear regression
%   J = COMPUTECOST(X, y, theta) computes the cost of using theta as the
%   parameter for linear regression to fit the data points in X and y

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variable correctly
J = 0;

% Accumulate the squared prediction error over all training examples
for i = 1:m
    J = J + ((X(i,:)*theta - y(i))^2) / (2*m);
end

end
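The loop gets the job done, but the same cost can also be computed in one line with matrix operations. A minimal vectorized sketch, equivalent to the loop above (the name computeCostVectorized is my own; it is not part of the assignment):

function J = computeCostVectorized(X, y, theta)
%COMPUTECOSTVECTORIZED Vectorized cost; a hypothetical alternative to the
% loop-based computeCost above, not required by the assignment.
m = length(y);        % number of training examples
errors = X*theta - y; % m x 1 vector of prediction errors
J = (errors' * errors) / (2*m); % sum of squared errors, divided by 2m
end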
The third step uses the gradient descent function to obtain the parameters theta of the hypothesis, and to record how the loss function changes during descent in J_history.
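Concretely, each iteration performs the standard batch gradient descent update, applied to theta_0 and theta_1 simultaneously:

\theta_j := \theta_j - \alpha \, \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}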
function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
%GRADIENTDESCENT Performs gradient descent to learn theta
%   theta = GRADIENTDESCENT(X, y, theta, alpha, num_iters) updates theta by
%   taking num_iters gradient steps with learning rate alpha

% Initialize some useful values
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);

for iter = 1:num_iters
    % ====================== YOUR CODE HERE ======================
    % Instructions: Perform a single gradient step on the parameter vector
    %               theta.
    %
    % Hint: While debugging, it can be useful to print out the values
    %       of the cost function (computeCost) and gradient here.
    %
    fprintf("The %d iteration\n", iter);
    % Compute both updates with temporaries so that theta(1) and theta(2)
    % are updated simultaneously
    temp0 = theta(1) - alpha * sum(X*theta - y) / m;
    temp1 = theta(2) - alpha * (X(:,2)' * (X*theta - y)) / m;
    theta(1) = temp0;
    theta(2) = temp1;
    % ============================================================

    % Save the cost J in every iteration
    J_history(iter) = computeCost(X, y, theta);
    fprintf("The cost is %f\n", J_history(iter));
end

end
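The two per-parameter updates can also be collapsed into a single vectorized step, which has the advantage of working for any number of features. A minimal sketch of what could replace the temp0/temp1 block (equivalent for this two-parameter case):

% Vectorized gradient step: X' * (X*theta - y) is m times the gradient
% of the cost, so this updates all parameters at once.
theta = theta - (alpha/m) * (X' * (X*theta - y));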
The fourth step plots the fitted line against the training data and checks the quality of the predictions; there is no function for us to implement here.
3. Plotting the loss function surface and contours
Finally, we plot the loss function as a function of theta, draw its contour lines, and mark the theta found by gradient descent.
%% ============= Part 3: Visualizing J(theta_0, theta_1) =============
fprintf('Visualizing J(theta_0, theta_1) ...\n')

% Grid over which we will calculate J
theta0_vals = linspace(-10, 10, 100);
theta1_vals = linspace(-1, 4, 100);

% initialize J_vals to a matrix of 0's
J_vals = zeros(length(theta0_vals), length(theta1_vals));

% Fill out J_vals
for i = 1:length(theta0_vals)
    for j = 1:length(theta1_vals)
        t = [theta0_vals(i); theta1_vals(j)];
        J_vals(i,j) = computeCost(X, y, t);
    end
end

% Because of the way meshgrids work in the surf command, we need to
% transpose J_vals before calling surf, or else the axes will be flipped
J_vals = J_vals';

% Surface plot
figure;
surf(theta0_vals, theta1_vals, J_vals)
xlabel('\theta_0'); ylabel('\theta_1');

% Contour plot
figure;
% Plot J_vals as 20 contours spaced logarithmically between 0.01 and 1000
contour(theta0_vals, theta1_vals, J_vals, logspace(-2, 3, 20))
xlabel('\theta_0'); ylabel('\theta_1');
hold on;
plot(theta(1), theta(2), 'rx', 'MarkerSize', 10, 'LineWidth', 2);
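One plot the assignment skips is J_history itself. A quick sketch (my own addition, reusing the variables already defined in the script) to visualize how the cost converges over the iterations:

% Re-run gradient descent, this time keeping the per-iteration costs,
% and plot the cost against the iteration number.
[theta, J_history] = gradientDescent(X, y, zeros(2, 1), alpha, iterations);
figure;
plot(1:numel(J_history), J_history, '-b', 'LineWidth', 2);
xlabel('Iteration'); ylabel('Cost J');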
And that's it. It looks quite simple, mainly because the framework was built for us and the tedious plotting work was already taken care of.