CS 221 Fall 2011 -- Extra Credit Problem 2
Due: 23:59:59 Sunday, December 11
This problem is worth a maximum of 2 points (2%) of your final overall grade.
Automated Curve-Fitting
For this problem you will write a function dofits(x,y,k)
that automatically tests several functional forms to see which best fits
a given set of data. Here x and y are column
vectors, where the x's are the independent values
and the y's are the dependent values; x and y
must be the same length. k is the maximum degree of polynomial
that should be fitted to the data. The functions fitted to the data
are:
- Polynomials up to degree k (the parameter).
- Exponential: y = α e^(β x).
- Power law: y = α x^β.
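Both non-polynomial forms become straight lines after taking natural logarithms, which is why a degree-1 polynomial fit can recover their parameters. The identities below use only α and β as defined in the list above, and they assume the logarithms are defined (y > 0, and x > 0 for the power law):

```latex
\begin{align*}
y = \alpha e^{\beta x} &\;\Longrightarrow\; \ln y = \ln\alpha + \beta x \\
y = \alpha x^{\beta}   &\;\Longrightarrow\; \ln y = \ln\alpha + \beta \ln x
\end{align*}
```

In both cases the slope of the fitted line is β and the intercept is ln α, so α is the exponential of the intercept.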
For each of the above types of curves, the
function does the following:
- If necessary, transform the data by taking logarithms, so that the form is
  a polynomial. For example, to fit an exponential curve, the y
  values must be replaced by their natural logarithms.
- Call polyfit() on the (possibly transformed) data
  to determine the coefficients of the polynomial.
- Call polyfit() again on the measured and predicted data to
  determine the goodness of fit by comparing the
  (measured, predicted) pairs with the y = x line.
  For example, if the data is fitted to a
  polynomial and the coefficients returned
  by the first call to polyfit() are in c,
  you would call polyfit(y,polyval(c,x),1) to get the slope and
  intercept of the best-fitting line. (Recall that "goodness of fit"
  can be quantified by the slope and intercept of the measured
  vs. predicted line, which should be close to 1 and 0, respectively.)
- Print the computed parameters of each fitted curve
  (i.e., the coefficients for the polynomials, and
  α and β for the exponential and power-law curves), along with the
  goodness-of-fit parameters (i.e., the coefficients returned
  by the second call to polyfit()) for each.
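The first three steps can be sketched as follows for one polynomial degree and for the exponential form. This is a minimal illustration, not a complete dofits(): the data is synthetic (an exact exponential, not one of the assignment's data sets), the variable names are made up for the example, and it assumes that predictions for the exponential are mapped back to the original scale before being compared with the measured values.

```matlab
% Synthetic, illustrative data: an exact exponential y = 3*exp(0.5*x).
x = (1:10)';
y = 3 * exp(0.5 * x);

% Polynomial of degree 2: no transformation needed.
c = polyfit(x, y, 2);               % coefficients, highest power first
g = polyfit(y, polyval(c, x), 1);   % slope, intercept of measured vs. predicted

% Exponential: fit a line to (x, log(y)), since log(y) = log(alpha) + beta*x.
ce    = polyfit(x, log(y), 1);
beta  = ce(1);                      % slope of the fitted line
alpha = exp(ce(2));                 % intercept is log(alpha)

% Assumption: predictions are converted back to the original scale
% before the goodness-of-fit line is computed.
ypred = alpha * exp(beta * x);
ge    = polyfit(y, ypred, 1);
```

Because this data is an exact exponential, beta and alpha come out very close to 0.5 and 3, and ge is very close to [1 0]. The power-law case follows the same pattern, with log(x) in place of x in the first polyfit() call.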
Requirements:
- Use fprintf() to produce output formatted as in the
examples below (note: the default precision for %f is six decimal places).
- Do not print anything else.
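One way to print a coefficient vector of any length is sketched below. It relies on the fact that a format string containing a single %f is reused for every element of a vector argument. The values in c are illustrative, and the variable name txt is made up for the example.

```matlab
c = [0.199847 5.037114];   % illustrative coefficient vector (degree 1)

% sprintf('%f ', c) repeats the format once per element of c.
txt = [sprintf('degree %d: [ ', numel(c) - 1), sprintf('%f ', c), ']'];
fprintf('%s\n', txt);      % prints: degree 1: [ 0.199847 5.037114 ]
```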
Note: You may find the
slides from this lecture helpful.
Example:
If you load this data set,
set x = ecdata(:,1) and y = ecdata(:,2),
and then call dofits(x,y,3), it should print:
degree 1: [ 0.199847 5.037114 ]
goodness: 0.998814 0.053474
degree 2: [ -0.000004 0.201369 4.935181 ]
goodness: 0.998818 0.053301
degree 3: [ -0.000000 0.000003 0.200215 4.973975 ]
goodness: 0.998819 0.053283
exponential: beta=0.005606, alpha=12.162348;
goodness: 1.185886, -7.713574
power law: beta=0.689275, alpha=1.191440;
goodness: 0.841387, 5.983828
(The actual curve here is a line y = 0.2x + 5; the fit
is not exact because noise has been added to the measurements.
Note that when coefficients of higher powers in
the polynomial are close to zero, a
lower-degree polynomial is indicated, even if the "goodness of
fit" is slightly better for the higher-degree one.)
For this data set,
it should print:
degree 1: [ 17174.793999 -390883.931651 ]
goodness: 0.831667 80200.898188
degree 2: [ 294.516124 -12571.334535 114800.253429 ]
goodness: 0.994642 2552.792031
degree 3: [ 1.813078 19.834749 -1418.908110 18606.836039 ]
goodness: 0.998609 662.759383
exponential: beta=0.060602, alpha=5035.903742;
goodness: 1.337127, -111070.266422
power law: beta=1.715680, alpha=-66.611871;
goodness: 0.449575, 92871.666940
(The actual equation for the above data is a cubic polynomial with
coefficients 2, -10, 8, and 100.)
For this one, you should get:
degree 1: [ 78.547616 -259.912497 ]
goodness: 0.991321 11.719844
degree 2: [ 0.677427 50.773115 -62.822336 ]
goodness: 0.998946 1.423064
degree 3: [ -0.010890 1.347143 39.578267 -20.960165 ]
goodness: 0.999143 1.157854
exponential: beta=0.086567, alpha=156.203270;
goodness: 1.376870, -421.973310
power law: beta=1.282841, alpha=26.245491;
goodness: 0.982327, 13.865161
Note that the "goodness of fit" measure is not foolproof, as the
actual data in this case was generated by a power law, with α
= 25 and β = 1.3 (y = 25 x^1.3), but the
parabola and cubic polynomial both have much better goodness-of-fit.