"Curve Fit" is a general purpose nonlinear fit application, adapted from a commercial
product I sold for the HP48 calculator years ago. It will fit data to
virtually any single valued function (linear or nonlinear) of a single variable, x.
The function can have up to 9 unknown fit parameters, and the data set can have up to
114 x,y pairs.
Important: This script requires 255 globals. For this reason, it will only run on
RPN versions 3.60 or greater.
The script contains built-in help to remind the user how to use it. What follows
elaborates on that help with more detailed instructions and examples on how to
use "Curve Fit".
Note 1: R0-R9 refer to the keyboard storage variables,
accessed by dragging items on the stack to the buttons in order to store, and
dragging from the numbered buttons to the stack in order to recall. I.e., R2 means
drag from the "2" button to the stack. R0 is used to store the number of fit parameters,
and R1-R9 contain the first guess at the fit parameters.
Note 2: When entering data, y is put on the stack first, followed by x.
Note 3: Since matrix operations occur on the stack, the function macro constructed
by the user MUST take one argument off the stack, x, and MUST return only one
argument to the stack, i.e., F(x). For the same reason, the user should use good stack
management when writing the function. There should be no reason to use more than 4 stack
levels. A 9 parameter fit will require 75 of the 80 available stack locations for its
internal computations.
Note 4: It's always better to make a reasonable guess at the fit parameters, and
this will become ABSOLUTELY necessary for many nonlinear functions with multiple fit
parameters.
Note 5: A simple linear regression function, LR, is included for the simple linear
cases. It actually runs the nonlinear routine, but takes care of loading R0 and the
first guess, and does not need a macro. It is merely a convenience.
-------------------------------
Example 1. What is the equation of a line that goes through the (x,y) points
(2,5.4) & (-8, 7.6)?
Clear any previous data with clrS. Enter the data using "S+". (Remember, y is
entered first, then x.)
The equation is y=R1 + R2*x
Enter the macro, assuming x is on the stack (i.e., R2 * R1 +), and put 2 in R0
(2 fit parameters) & a first guess of 0 in R1 & R2.
Run "Fit" (choosing Equal weighting)
The program is done after 3 very fast iterations. Recall R2 & R1 to see -0.22 as
the slope & 5.84 as the y intercept. Probable errors in the fit parameters have
no meaning for the case of 2 data points and 2 fit parameters, so none are
returned to the stack.
Alternatively, enter the x,y data and hit the "LR" button to see the same results.
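For readers who want to check the arithmetic outside the calculator, here is a minimal
sketch of Example 1 in Python with NumPy (the script itself uses RPN macros, not Python;
a line through two points is determined exactly, so an ordinary least-squares solve
reproduces the fit):

```python
import numpy as np

x = np.array([2.0, -8.0])
y = np.array([5.4, 7.6])

# Design matrix for y = R1 + R2*x
A = np.column_stack([np.ones_like(x), x])
(r1, r2), *_ = np.linalg.lstsq(A, y, rcond=None)

print(r1, r2)  # 5.84 (y intercept), -0.22 (slope)
```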
-------------------------------
Example 2. Fit the following data:
y, x
301, 0
296, 15
230, 30
181, 45
170, 75
194, 90
167, 120
208, 150
312, 180
which represents detector counts vs angle in degrees for a scattering experiment,
to the function y=R1*P0+R2*P2+R3*P4
where P0, P2 & P4 are Legendre polynomials.
P0=1
P2=1/2 *(3*(cos(x))^2 -1)
P4=1/8 *(35*(cos(x))^4 - 30*(cos(x))^2 + 3)
Enter the data using the "S+" function. Clear old data first. Store 3 into R0,
and a first guess of 1 into R1, R2 & R3. Record the macro, making sure degrees mode
is set:
cos ENTER 4 ^ 35 * OVER 2 ^ 30 * - 3 + 8 / R3 * SWAP 2 ^ 3 * 1 - 2 / R2 * + R1 +
(OVER is accomplished by dragging the second stack location to the ENTER button.)
Now execute "Fit" and choose Poisson weighting, since the y data are counts and obey
Poisson statistics. Watch the iteration count and the updated value of the reduced
chi-square as the program iterates to a solution.
After 3 iterations the solution converges. The reduced chi-square and the 1 sigma error estimates
for the fit parameters are returned to the stack as 0.60296 (reduced chi-square), & 5.5, 10.1,
& 12.0 for the R1, R2 & R3 errors respectively. Recall R1, R2 & R3 to see
R1=189.8, R2=52.66 & R3=66.49.
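As a cross-check of this example (a NumPy sketch, not the script's own code): because
the model is linear in R1-R3, the Poisson-weighted fit reduces to weighted linear least
squares, so the normal equations reproduce the values above directly, with the 1 sigma
errors coming from the inverse curvature matrix:

```python
import numpy as np

x_deg = np.array([0, 15, 30, 45, 75, 90, 120, 150, 180], dtype=float)
y = np.array([301, 296, 230, 181, 170, 194, 167, 208, 312], dtype=float)

# Legendre polynomials P0, P2, P4 in cos(x), x in degrees
c = np.cos(np.radians(x_deg))
p0 = np.ones_like(c)
p2 = (3 * c**2 - 1) / 2
p4 = (35 * c**4 - 30 * c**2 + 3) / 8

A = np.column_stack([p0, p2, p4])
w = 1.0 / y                        # Poisson weighting: sigma_i^2 = y_i

M = A.T @ (w[:, None] * A)         # curvature matrix
b = A.T @ (w * y)
params = np.linalg.solve(M, b)
errors = np.sqrt(np.diag(np.linalg.inv(M)))
chi2 = np.sum(w * (y - A @ params) ** 2)

print(params)  # ~ [189.8, 52.66, 66.49]
print(errors)  # ~ [5.5, 10.1, 12.0]
print(chi2)    # ~ 3.01 (total chi-square)
```

Dividing that total chi-square by 5 reproduces the 0.60296 reported above, which
suggests the script normalizes by N - p - 1.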
This fit takes 11 seconds on my Tungsten C. My benchmarking indicates the
Tungsten C runs 5.5 times faster than a 16 MHz Dragonball Visor Deluxe. Though I
have not tested it directly, this example should take about a minute to converge on
the Visor. Another test example I use with 9 fit parameters, 114 data points, and
a rather complex function, requires 4 iterations to converge to a solution after
8 minutes on my Tungsten C machine. This equates to 44 minutes on the Visor.
---------------------------------
Example 3. This example, from the Bevington reference listed below, illustrates
why it is not a good idea to use linear fit routines for exponential equations.
Consider the following decay data:
y, ln(y), x
106, 4.663, 0
80, 4.382, 15
98, 4.585, 30
75, 4.317, 45
74, 4.304, 60
73, 4.290, 75
49, 3.892, 90
38, 3.638, 105
37, 3.611, 120
22, 3.091, 135
The decay equation is:
y=R1*exp(-x/R2)
Enter the (y,x) data, put 2 in R0 and first guess 100 in R1 & R2
Enter the macro: R2 / +/- exp R1 *
Do Fit with equal weighting, and after a few iterations, recall R1=108.1 &
R2=119.6. The 1 sigma errors are returned as 7.9 & 19 for R1 & R2 respectively.
Here, the importance of a reasonable first guess can be illustrated.
Try R1 & R2 set to 1, and watch the routine go unstable after a few iterations.
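This nonlinear fit can be reproduced with SciPy's Levenberg-Marquardt implementation
standing in for the script's Marquardt routine (a sketch, not the script's code; the
error-bar scaling convention may differ from the script's):

```python
import numpy as np
from scipy.optimize import curve_fit

x = np.arange(0.0, 150.0, 15.0)   # 0, 15, ..., 135
y = np.array([106, 80, 98, 75, 74, 73, 49, 38, 37, 22], dtype=float)

def decay(x, r1, r2):
    # y = R1 * exp(-x / R2)
    return r1 * np.exp(-x / r2)

# Equal weighting; the first guess (100, 100) matters for nonlinear fits.
params, cov = curve_fit(decay, x, y, p0=[100.0, 100.0])
errors = np.sqrt(np.diag(cov))    # 1 sigma estimates

print(params)  # ~ [108.1, 119.6]
print(errors)
```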
Now transform the exponential to linear form: ln(y)=ln(R1) - (1/R2)*x, clear the
old (x,y) data, and reenter the data as (ln(y),x). Use the "LR" function to find
ln(R1)=4.7746 & (-1/R2)= -0.01033. Transforming back to the original form, we get
R1=118.5 & R2=96.8. This is close to, but substantially different from, the true
answer. What has happened is that in transforming the equation and
data, we have also transformed the weights of the y values, but ignored those weights
in the fit. The result is that the simple LR fit tends to erroneously optimize the
fit for the smaller y values.
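The biased log-transform result is easy to reproduce (a NumPy sketch, not the script's
code): an unweighted straight-line fit to (x, ln y) gives back the values quoted above.

```python
import numpy as np

x = np.arange(0.0, 150.0, 15.0)   # 0, 15, ..., 135
y = np.array([106, 80, 98, 75, 74, 73, 49, 38, 37, 22], dtype=float)

# Unweighted linear regression of ln(y) on x
slope, intercept = np.polyfit(x, np.log(y), 1)
r1 = np.exp(intercept)            # 118.5, vs 108.1 from the weighted nonlinear fit
r2 = -1.0 / slope                 # 96.8,  vs 119.6

print(intercept, slope)  # ~ 4.7746, -0.01033
```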
-----------------------------------------------------------------------------------
For those with both the interest and the mathematical inclination, I use the
algorithm of Marquardt, which begins with a square "curvature" matrix of partial
derivatives of the function with respect to the fitting parameters. The j,k terms
of this matrix are the various cross products of partial derivatives over the
indexed fit parameters. The curvature matrix provides the direction of the
correction to the fit parameters. This matrix is augmented by a vector composed of
products of partial derivatives and the difference between the function and the y
data point. This vector provides an estimate of the magnitude and sign of the
correction to the fit parameters.
All elements in the augmented p by (p+1) matrix are summed over all data points,
where p is the number of fit parameters. The first correction to the initial fit
coefficient set is obtained by solving the augmented matrix, i.e., the
augmented matrix is diagonalized. Assuming the chi-square is reduced by the new
fit coefficient set, these corrected parameters become the new set. This continues
until a set of fit parameters is found that no longer reduces chi-squared.
Most nonlinear methods are variants of one of two techniques, a gradient search
or a parabolic expansion of chi-squared which is roughly equivalent to a linear
approximation to the fit function. A gradient search works well when far from the
chi-squared minimum but converges slowly as the minimum is approached, whereas
parabolic expansion of chi-squared works well close to the chi-squared minimum
but not very well far from the minimum. Marquardt’s method multiplies the
diagonal of the curvature matrix by a dynamically variable conditioning factor
so as to allow the algorithm to behave like a hybrid of the two methods.
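The iteration described above can be sketched in Python (an illustration with
numerical partial derivatives and names of my own choosing, not the script's actual
code; the (1 + lambda) diagonal scaling follows Bevington's presentation):

```python
import numpy as np

def marquardt_fit(f, x, y, p0, w=None, tol=1e-9, max_iter=100):
    """Fit y ~ f(x, p) by Marquardt's method. w = weights (1/sigma^2)."""
    p = np.asarray(p0, dtype=float)
    w = np.ones_like(y) if w is None else w
    lam = 1e-3                                   # conditioning factor

    def chisq(p):
        return np.sum(w * (y - f(x, p)) ** 2)

    def jacobian(p, h=1e-6):
        # Central-difference partial derivatives w.r.t. each fit parameter
        cols = []
        for k in range(len(p)):
            dp = np.zeros_like(p)
            dp[k] = h
            cols.append((f(x, p + dp) - f(x, p - dp)) / (2 * h))
        return np.column_stack(cols)

    chi = chisq(p)
    for _ in range(max_iter):
        J = jacobian(p)
        r = y - f(x, p)
        alpha = J.T @ (w[:, None] * J)           # curvature matrix
        beta = J.T @ (w * r)                     # gradient (correction) vector
        # Multiplying the diagonal by (1 + lambda) interpolates between a
        # gradient search (large lambda) and the linearized step (small lambda).
        trial = p + np.linalg.solve(alpha + lam * np.diag(np.diag(alpha)), beta)
        chi_trial = chisq(trial)
        if chi_trial < chi:                      # accept: relax conditioning
            p, lam = trial, lam / 10
            if chi - chi_trial < tol:            # no further reduction: done
                return p
            chi = chi_trial
        else:                                    # reject: steepen conditioning
            lam *= 10
    return p

# Usage: the line of Example 1, y = R1 + R2*x through (2, 5.4) and (-8, 7.6)
line = lambda x, p: p[0] + p[1] * x
p = marquardt_fit(line, np.array([2.0, -8.0]), np.array([5.4, 7.6]), [0.0, 0.0])
print(p)  # ~ [5.84, -0.22]
```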
References: "Data Reduction & Error Analysis", P.R. Bevington
& "Nonlinear Regression Analysis", D. Bates & D. Watts.
Revision History:
V3.0 First general release (for RPN v3.60). Nine parameter fit & 114 data points in a
single script.
V2.0 8 parameter fit and 101 data points. Eliminated compression. Requires additional
dummy scripts for more than 10 data points, each script storing 13 data points.
--not a general release--
V1.0 8 parameter fit and 20 data points in a single script. Compressed x,y data pair
into a single global register at the cost of precision.
--not a general release--