I recently learned about the Python function scipy.optimize.curve_fit,
and I’m really happy I did.
It fills a need I’d always had without ever noticing it, let alone
satisfying it: I often have a dataset and a function with some
parameters, and I just want the damn parameters to be fitted
to that dataset, even if imperfectly. Please don’t ask any
more annoying questions like “Is the dataset generated by a
Gaussian?” or “Is the underlying process ergodic?”, just fit the
goddamn curve!
And scipy.optimize.curve_fit does exactly that!
You give it a function f with some parameters a, b, c, … and a
dataset consisting of input values x and output values y, and it
then optimizes a, b, c, … so that f(x, a, b, c, …) is as close as
possible to y (where, of course, x and y can both be numpy arrays).
This is awesome! I have some datapoints x, y and I believe they’re
generated by some obscure function, let’s say of the form
f(x,a,b,c)=a⋅x⋅sin(b⋅x+c), but I don’t know
the exact values for a, b and c?
No problem! I just throw the whole thing into curve_fit
(scipy.optimize.curve_fit(f, x, y)) and out comes an array of optimal
values for a, b, c (plus a covariance matrix for those estimates)!
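Here is a minimal sketch of that workflow. The data is synthetic: the "true" parameters, the noise level, the seed and the starting guess p0 are all made up by me for illustration. I pass p0 because curve_fit otherwise starts every parameter at 1, and with an oscillating function like this one the optimizer can get stuck in a local minimum if the starting frequency is far off.

```python
import numpy as np
from scipy.optimize import curve_fit


def f(x, a, b, c):
    # Model from the text: f(x, a, b, c) = a * x * sin(b * x + c)
    return a * x * np.sin(b * x + c)


# Synthetic dataset (illustrative assumption, not real data).
rng = np.random.default_rng(0)
true_params = (2.0, 1.5, 0.5)
x = np.linspace(0, 6, 120)
y = f(x, *true_params) + rng.normal(scale=0.1, size=x.size)

# Rough starting guess (assumed); without p0, curve_fit starts at all ones.
p0 = (1.0, 1.4, 0.3)

# curve_fit returns the fitted parameters and their estimated covariance.
popt, pcov = curve_fit(f, x, y, p0=p0)

a_fit, b_fit, c_fit = popt
print("fitted a, b, c:", a_fit, b_fit, c_fit)
print("one-sigma uncertainties:", np.sqrt(np.diag(pcov)))
```

The square roots of the diagonal of pcov give a rough one-standard-deviation error estimate for each parameter, which is handy for a quick sanity check on the fit.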
What if I then want c to be necessarily non-negative?
Trivial! curve_fit comes with an optional argument called bounds.
Since c is the third parameter, I call
scipy.optimize.curve_fit(f, x, y, bounds=([-numpy.inf, -numpy.inf, 0], numpy.inf)),
which says that curve_fit should not make the third parameter smaller
than zero, but otherwise can do whatever it wants.
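The bounded call can be sketched like this. Again the dataset is synthetic and the numbers are my own assumptions; this time I deliberately generate the data with a negative c so that the lower bound actually matters.

```python
import numpy as np
from scipy.optimize import curve_fit


def f(x, a, b, c):
    # Same model as in the text: a * x * sin(b * x + c)
    return a * x * np.sin(b * x + c)


# Synthetic dataset generated with a negative c (illustrative assumption).
rng = np.random.default_rng(1)
x = np.linspace(0, 6, 120)
y = f(x, 2.0, 1.5, -0.5) + rng.normal(scale=0.1, size=x.size)

# One lower bound per parameter, in the order (a, b, c); only c is restricted.
lower = [-np.inf, -np.inf, 0]
# A scalar upper bound applies to every parameter.
upper = np.inf

# The starting guess (assumed) has to lie inside the bounds.
popt_bounded, pcov_bounded = curve_fit(
    f, x, y, p0=(1.0, 1.4, 0.3), bounds=(lower, upper)
)

print("fitted a, b, c with c >= 0:", popt_bounded)
```

The bounds argument is a pair (lower, upper), and each side is either a single number or a sequence with one entry per parameter. Because the data here was generated with a negative c, the constrained fit cannot recover that value and has to settle for the best one it can find at or above zero.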
So far, I’ve already used this function two times,
and I’ve only known about it for a week! A must for every wannabe
data-scientist.
For more information about this amazing function, consult its
documentation.
scipy.optimize.curve_fit Is Awesome
cross-posted from niplav.github.io