Least Squares Linear Regression (last updated 9/9/99)
The information in this tutorial is located in the MATLAB manual. Any page or section numbers refer to the following:
The Student Edition of MATLAB, Version 5, The MATH WORKS Inc. Prentice-Hall, 1997.
======================================================================================
This tutorial contains the following sections:
Example Problem - Least Squares Linear Regression
Solution Using Linear Algebra Methods in the Regular Version of MATLAB
Solution Using the regress Command in the Statistics Toolbox
=============================================================================================================
Determine the parameters a1, a2, and a3 in the equation $y = a_1 + a_2 x_1 + a_3 x_2$, given a set of n data points $(x_1, x_2, y)$. Each data point gives one equation:
$$
\begin{aligned}
a_1 + a_2 x_{11} + a_3 x_{21} &= y_1 \\
a_1 + a_2 x_{12} + a_3 x_{22} &= y_2 \\
a_1 + a_2 x_{13} + a_3 x_{23} &= y_3 \\
a_1 + a_2 x_{14} + a_3 x_{24} &= y_4 \\
&\;\;\vdots
\end{aligned}
$$
where the first subscript on x identifies the independent variable and the second subscript identifies the data point.
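In matrix notation this is expressed as:
$$
\begin{bmatrix}
1 & x_{11} & x_{21} \\
1 & x_{12} & x_{22} \\
1 & x_{13} & x_{23} \\
\vdots & \vdots & \vdots \\
1 & x_{1n} & x_{2n}
\end{bmatrix}
\begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix}
=
\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ \vdots \\ y_n \end{bmatrix}
$$
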
The parameters can be found in MATLAB in two ways:
(a) Using the regress command in the Statistics Toolbox.
(b) Using linear algebra methods in the basic MATLAB package. When a linear system has more equations than unknowns, the backslash operator (x = A\b) returns the least squares solution, which is exactly what a linear regression requires.
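To make the claim about overdetermined systems concrete, here is a minimal sketch that compares the backslash result with the textbook normal-equations solution. The four data points and the variable names (x1, X, a_backslash, a_normal) are made up for illustration and are not part of the example problem.

```matlab
% Hypothetical data: 4 equations, 2 unknowns (intercept and slope).
x1 = [1; 2; 3; 4];
y  = [2.1; 3.9; 6.2; 7.8];

% Column of ones for the intercept, then the independent variable.
X = [ones(4,1) x1];

% Backslash on an overdetermined system gives the least squares solution.
a_backslash = X\y;

% Normal equations, (X'X) a = X'y, solved as a square system.
a_normal = (X'*X)\(X'*y);

% The two parameter vectors should agree to within round-off.
difference = norm(a_backslash - a_normal);
fprintf('Difference between the two solutions: %g\n', difference)
```

In practice the backslash form is preferred because it avoids forming X'*X explicitly, which can lose accuracy when the columns of X are nearly dependent.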
=============================================================================================================
Example Problem - Linear Least Squares Regression
Develop a linear correlation to predict the final weight of an animal based on the initial weight and the amount of feed eaten.
(final weight) = a1 + a2*(initial weight) + a3*(feed eaten)
The following data are given:
| final weight | initial weight | feed eaten |
| 95 | 42 | 272 |
| 77 | 33 | 226 |
| 80 | 33 | 259 |
| 100 | 45 | 292 |
| 97 | 39 | 311 |
| 70 | 36 | 183 |
| 50 | 32 | 173 |
| 80 | 41 | 236 |
| 92 | 40 | 230 |
| 84 | 38 | 235 |
=============================================================================================================
Solution Using Linear Algebra Methods in the Regular Version of MATLAB.
Using the linear algebra commands of the basic version of MATLAB has the advantage that you can perform linear regression using the Student Edition of MATLAB. The disadvantage is that no statistical information is returned automatically (i.e., no residuals, confidence intervals, or correlation coefficients); anything you need, such as the residuals, must be calculated separately.
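In linear algebra notation the problem is written as Ax = b, which MATLAB solves with the statement x = A\b. In this example A is the x matrix of independent variables, x is the parameter vector a, and b is the vector of final weights fw.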
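As a compact sketch of the A\b statement applied to the example data, the block below builds the matrix directly (rather than with a loop) and also shows why the reversed form b/A fails. The names A, b, and failed are chosen here for illustration; the try/catch is only a way to show the error without stopping the script.

```matlab
% Example data from the table above, stored as column vectors.
initwgt = [42 33 33 45 39 36 32 41 40 38]';
feed    = [272 226 259 292 311 183 173 236 230 235]';
fw      = [95; 77; 80; 100; 97; 70; 50; 80; 92; 84];

% A is 10x3: a column of ones for a1, then the two independent variables.
A = [ones(10,1) initwgt feed];
b = fw;

% Left division solves A*x = b in the least squares sense (x is 3x1).
x = A\b;

% Right division b/A needs b and A to have the same number of columns.
% Here b has 1 column and A has 3, so MATLAB raises an error.
try
    bad = b/A;
    failed = false;
catch
    failed = true;
end
fprintf('b/A raised an error: %d\n', failed)
```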
_______________________________________________________________________________________
Here is the m-file used to solve the example problem using linear algebra methods:
_______________________________________________________________________________________
% Darin Ridgway
% ChE XXX
% July 11, 1998

% Load the independent variables into vectors.
initwgt = [ 42 33 33 45 39 36 32 41 40 38];
feed = [ 272 226 259 292 311 183 173 236 230 235];
fw = [95; 77; 80; 100; 97; 70; 50; 80; 92; 84];

% Then create the x matrix from these vectors using a for loop.
% You could create the x matrix directly, but you will possibly want
% the vectors of the independent variables later for plotting.
% The dependent variable goes into a 10x1 column vector
% (y duplicates fw, which is already a column vector).
for n = 1:10
    x(n,1) = 1;
    x(n,2) = initwgt(n);
    x(n,3) = feed(n);
    y(n,1) = fw(n);
end

% Use the matrix division operation.
% Note: The notation x = b/A will not work.
a = x\fw;

% Calculate the values of the final weight predicted by the equation.
% Then calculate the difference between the experimental and
% predicted values.
fwpred = x*a;
res = fw - fwpred;

% Create output
disp(' Darin Ridgway')
disp(' ChE XXX')
disp(' July 11, 1998')
disp(' Example Problem X')
fprintf('\n\n\n The parameters a1, a2, and a3 respectively are \n')
fprintf(' a1 = %5.2e \n', a(1))
fprintf(' a2 = %5.2e \n', a(2))
fprintf(' a3 = %5.2e \n\n', a(3))
fprintf(' Experimental wgt   Predicted wgt   Difference \n')
for n = 1:10
    fprintf('    %5.2e       %5.2e     %5.2e \n', fw(n), fwpred(n), res(n))
end
_________________________________________________________________________________________________
Here is the response in the Command Window
_________________________________________________________________________________________________
Darin Ridgway
ChE XXX
July 11, 1998
Example Problem X
The parameters a1, a2, and a3 respectively are
a1 = -2.30e+001
a2 = 1.40e+000
a3 = 2.18e-001
Experimental wgt   Predicted wgt   Difference
9.50e+001   9.48e+001    1.84e-001
7.70e+001   7.22e+001    4.76e+000
8.00e+001   7.94e+001    5.74e-001
1.00e+002   1.03e+002   -3.36e+000
9.70e+001   9.91e+001   -2.12e+000
7.00e+001   6.71e+001    2.93e+000
5.00e+001   5.93e+001   -9.32e+000
8.00e+001   8.56e+001   -5.59e+000
9.20e+001   8.29e+001    9.12e+000
8.40e+001   8.12e+001    2.82e+000
__________________________________________________________________________
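Although the backslash solution returns only the parameters, the usual summary statistics can be calculated from the residuals by hand. The following is a sketch using the standard definitions of R-squared and the F statistic; the variable names (sse, sst, r2, f, p) are chosen here for illustration. The results should agree with the R-squared of 0.8732 and F of 24.0934 that regress reports for the same data in the next section.

```matlab
% Example data as column vectors.
initwgt = [42 33 33 45 39 36 32 41 40 38]';
feed    = [272 226 259 292 311 183 173 236 230 235]';
fw      = [95; 77; 80; 100; 97; 70; 50; 80; 92; 84];

% Fit the model and compute the residuals as in the m-file above.
x   = [ones(10,1) initwgt feed];
a   = x\fw;
res = fw - x*a;

n = length(fw);      % number of data points
p = size(x,2);       % number of parameters, including the constant

% Sum of squared residuals and total sum of squares about the mean.
sse = sum(res.^2);
sst = sum((fw - mean(fw)).^2);

% R-squared: fraction of the variation in fw explained by the model.
r2 = 1 - sse/sst;

% F statistic for the regression: p-1 model and n-p error degrees of freedom.
f = ((sst - sse)/(p - 1)) / (sse/(n - p));

fprintf(' R-squared = %6.4f \n', r2)
fprintf(' F         = %6.4f \n', f)
```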
After calculating the equation parameters you may wish
to plot the function versus the data points, especially if there is a single
independent variable. Use plot to plot the data as discrete
points and fplot to plot the function.
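As an illustration of the plot and fplot combination, the sketch below fits final weight against feed eaten alone, so that there is a single independent variable to plot against. This one-variable model is an assumption made only for the plot and is not the model solved above. The anonymous function syntax (@) is also an assumption: it requires a more recent MATLAB release than Version 5, where fplot takes the function as a string instead.

```matlab
% Example data: feed eaten and final weight as column vectors.
feed = [272 226 259 292 311 183 173 236 230 235]';
fw   = [95; 77; 80; 100; 97; 70; 50; 80; 92; 84];

% Fit fw = b1 + b2*feed by least squares.
X = [ones(size(feed)) feed];
b = X\fw;

% The fitted line as a function of one variable.
line_fn = @(f) b(1) + b(2)*f;

% Data as discrete points, fitted function as a continuous curve.
plot(feed, fw, 'o')
hold on
fplot(line_fn, [min(feed) max(feed)])
hold off
xlabel('feed eaten')
ylabel('final weight')
legend('data', 'least squares line')
```

With two independent variables, as in the example problem, a plot of predicted versus experimental final weight is a common alternative.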
==============================================================================================================
Solution Using the regress Command in the Statistics Toolbox
Note: The Statistics Toolbox is not in the Student Edition of MATLAB. You must use a machine with the professional version.
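The regress command allows you to obtain a great deal of statistical information. The quantities you want are requested as output arguments in the call to the regress routine. With the variable names used in the m-file below, these are:
a: the estimated parameters
aint: confidence intervals for the parameters
res: the residuals
rint: confidence intervals for the residuals
stats: the R-squared statistic, the F statistic, and the p-value for the regression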
Here is the m-file used to solve the example problem using regress:
_______________________________________________________________________
% Darin Ridgway
% ChE XXX
% July 11, 1998

% Load the independent variables into vectors.
initwgt = [ 42 33 33 45 39 36 32 41 40 38];
feed = [ 272 226 259 292 311 183 173 236 230 235];
fw = [95; 77; 80; 100; 97; 70; 50; 80; 92; 84];

% Then create the x matrix from these vectors using a for loop.
% You could create the x matrix directly, but you will possibly want
% the vectors of the independent variables later for plotting.
% The dependent variable goes into a 10x1 column vector
% (y duplicates fw, which is already a column vector).
for n = 1:10
    x(n,1) = 1;
    x(n,2) = initwgt(n);
    x(n,3) = feed(n);
    y(n,1) = fw(n);
end

% Set the value of alpha and call the regress command.
% Note: regress returns 100*(1-alpha)% confidence intervals, so
% alpha = 0.80 requests 20% intervals. Use alpha = 0.20 if you
% want 80% intervals.
alpha = 0.80;
[a,aint,res,rint,stats] = regress(fw,x,alpha)
________________________________________________________________________
Here is the output when you execute the m-file.
It is all shown here to demonstrate the information regress provides.
In a submission you should use formatted output.
_________________________________________________________________________
a =
-22.9932
1.3957
0.2176
aint =
-27.6677 -18.3187
1.2424 1.5490
0.2024 0.2328
res =
0.1841
4.7553
0.5741
-3.3552
-2.1158
2.9257
-9.3155
-5.5862
9.1152
2.8184
rint =
-1.3520 1.7202
3.3668 6.1439
-0.7075 1.8557
-4.6340 -2.0764
-3.3535 -0.8782
1.5472 4.3041
-10.1864 -8.4446
-6.9905 -4.1818
7.9047 10.3256
1.2196 4.4172
stats =
0.8732 24.0934 0.0007
____________________________________________________________________
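The note above recommends formatted output in a submission. Here is one possible sketch of how the parameter estimates and their intervals could be printed with sprintf and fprintf. To keep the block runnable without the Statistics Toolbox, the values of a, aint, and stats are copied from the listing above; in your own m-file they would come from the call to regress. The layout and the variable name rows are only suggestions.

```matlab
% Values copied from the regress output shown above.
a     = [-22.9932; 1.3957; 0.2176];
aint  = [-27.6677 -18.3187; 1.2424 1.5490; 0.2024 0.2328];
stats = [0.8732 24.0934 0.0007];

% Build one formatted line per parameter: estimate and interval limits.
rows = cell(3,1);
for k = 1:3
    rows{k} = sprintf(' a%d = %9.4f   (%9.4f to %9.4f)', ...
                      k, a(k), aint(k,1), aint(k,2));
end

% Print the table followed by the summary statistics.
fprintf('\n Parameter estimates and confidence intervals \n')
for k = 1:3
    fprintf('%s \n', rows{k})
end
fprintf('\n R-squared = %6.4f \n', stats(1))
fprintf(' F         = %7.4f \n', stats(2))
fprintf(' p-value   = %6.4f \n', stats(3))
```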
The predicted values can be calculated either as x*a, as in the linear algebra solution, or as fw - res.
After calculating the equation parameters you may wish
to plot the function versus the data points, especially if there is a single
independent variable. Use plot to plot the data as discrete
points and fplot to plot the function.