modeling — Modeling#
The modeling module provides all the core tools for fitting models to data, including
parameter estimation and uncertainty quantification. All predefined simple models
are defined in the models module, which builds on top of the
functionality provided here.
More complex models can be built by summing simple ones, e.g.,
>>> from aptapy.models import Line, Gaussian
>>>
>>> model = Line() + Gaussian()
The main fitting engine supports bounded fits and/or fits with fixed parameters.
See also
The models — Fitting models section lists all the predefined fitting models.
Also, have a look at the Simple fit, Composite fit and Constrained fit examples.
Parameters#
The first central concept in the modeling module is that of a fit parameter,
represented by the FitParameter class. A fit parameter
is a named mutable object that holds a value, an optional uncertainty, and optional
bounds, along with a flag that indicates whether it should be varied or not in a fit.
FitParameter objects provide all the facilities for
pretty-printing their value and uncertainty. The following example shows the basic
semantics of the class:
>>> from aptapy.modeling import FitParameter
>>>
>>> param = FitParameter(1.0, "amplitude", error=0.1)
>>> print(param)
Amplitude: 1.0 ± 0.1
Fit status#
FitStatus is a small bookkeeping class that holds all the
information about the status of a fit, such as the chisquare, the number of degrees of
freedom and the fit range.
Warning
At this point the implementation of the class is fairly minimal, and it is very likely that we will be adding stuff along the way.
Simple models#
Chances are you will not have to interact with FitParameter
and FitStatus objects a lot, but they are central to defining
and using simple fit models, and heavily used internally.
The easiest way to see how you would go about defining an actual fit model is to look at the source code for a simple one.
class Line(AbstractFitModel):

    """Linear model.
    """

    slope = FitParameter(1.)
    intercept = FitParameter(0.)

    @staticmethod
    def evaluate(x: ArrayLike, slope: float, intercept: float) -> ArrayLike:
        # pylint: disable=arguments-differ
        return slope * x + intercept

    def init_parameters(self, xdata: ArrayLike, ydata: ArrayLike, sigma: ArrayLike = 1.) -> None:
        """Overloaded method.

        This is simply using a weighted linear regression.

        .. note::

           This should provide the exact result in most cases, but, in the spirit of
           providing a common interface across all models, we are not overloading the
           fit() method. (Everything will continue working as expected, e.g., when
           one uses bounds on parameters.)
        """
        # pylint: disable=invalid-name
        if isinstance(sigma, Number):
            sigma = np.full(ydata.shape, sigma)
        weights = 1. / sigma**2.
        S0x = weights.sum()
        S1x = (weights * xdata).sum()
        S2x = (weights * xdata**2.).sum()
        S0xy = (weights * ydata).sum()
        S1xy = (weights * xdata * ydata).sum()
        D = S0x * S2x - S1x**2.
        if D != 0.:
            self.slope.init((S0x * S1xy - S1x * S0xy) / D)
            self.intercept.init((S2x * S0xy - S1x * S1xy) / D)

    def primitive(self, x: ArrayLike) -> ArrayLike:
        slope, intercept = self.parameter_values()
        return 0.5 * slope * x**2 + intercept * x
All we really have to do is to subclass AbstractFitModel,
listing all the fit parameters as class attributes (assigning them sensible default
values), and implement the evaluate() method,
which takes as first argument the independent variable and then the values of all the
fit parameters.
Note
It goes without saying that the order of the fit parameters in the argument
list of the evaluate() method must
match the order in which they are defined as class attributes.
In this particular case we are saying that the Line model has two fit parameters,
slope and intercept, and, well, the model itself evaluates as a straight line
as we would expect.
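The closed-form initialization performed by init_parameters() can be checked independently of aptapy. What follows is a minimal, pure-Python sketch of the same weighted linear regression; the function name weighted_line_fit is hypothetical and not part of the library. For data lying exactly on a line it recovers the slope and intercept whatever the weights are.

```python
def weighted_line_fit(xdata, ydata, sigma):
    """Closed-form weighted least-squares fit of y = slope * x + intercept.

    Hypothetical helper mirroring the sums used in Line.init_parameters().
    """
    weights = [1.0 / s**2 for s in sigma]
    s0x = sum(weights)
    s1x = sum(w * x for w, x in zip(weights, xdata))
    s2x = sum(w * x**2 for w, x in zip(weights, xdata))
    s0xy = sum(w * y for w, y in zip(weights, ydata))
    s1xy = sum(w * x * y for w, x, y in zip(weights, xdata, ydata))
    det = s0x * s2x - s1x**2
    if det == 0.0:
        # All the x values coincide: the line is not constrained.
        raise ValueError("degenerate input, cannot fit a line")
    slope = (s0x * s1xy - s1x * s0xy) / det
    intercept = (s2x * s0xy - s1x * s1xy) / det
    return slope, intercept


xdata = [0.0, 1.0, 2.0, 3.0]
ydata = [1.0, 3.0, 5.0, 7.0]  # points lying exactly on y = 2x + 1
sigma = [0.1, 0.2, 0.1, 0.3]
slope, intercept = weighted_line_fit(xdata, ydata, sigma)
```

Note that, unlike the library method, this sketch raises on degenerate input instead of silently leaving the parameters untouched.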
When we create an instance of a fitting model
>>> model = Line()
a few things happen under the hood:
- the class instance gets its own copy of each fit parameter, so that we can change their values and settings without affecting the class definition or other class instances;
- the class instance registers the fit parameters as attributes of the instance, so that we can access them as, e.g., model.intercept and model.slope.
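The copy-on-instantiation behavior can be illustrated with a small, self-contained sketch. The ParamSketch and LineSketch classes are hypothetical stand-ins, not the actual aptapy implementation; they only show how class-level defaults can be copied into each instance so that changes stay local.

```python
from dataclasses import dataclass, replace


@dataclass
class ParamSketch:
    """Hypothetical stand-in for a fit parameter (not aptapy's FitParameter)."""
    value: float
    frozen: bool = False


class LineSketch:
    # Class-level defaults, shared by the class definition itself.
    slope = ParamSketch(1.0)
    intercept = ParamSketch(0.0)

    def __init__(self):
        # Give this instance a private copy of each class-level parameter.
        for name, attr in list(vars(type(self)).items()):
            if isinstance(attr, ParamSketch):
                setattr(self, name, replace(attr))


first = LineSketch()
second = LineSketch()
first.slope.value = 3.0  # only affects the first instance
```

After the assignment, second.slope and the class-level LineSketch.slope still hold the default value.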
That is pretty much it. The next thing that you probably want to do is to fit
the model to a series of data points, which you do in pretty much the same fashion
as you would do with scipy.optimize.curve_fit, using the
fit() method. This will return a
FitStatus object containing information about the fit.
Fitting primer#
Assuming that you have a set of data points xdata, ydata, the latter with
associated uncertainties yerrors, the simplest fit goes like
>>> from aptapy.models import Line
>>>
>>> model = Line()
>>> status = model.fit(xdata, ydata, sigma=yerrors)
You can fit within a subrange of the input data by specifying the xmin and/or
the xmax keyword arguments:
>>> from aptapy.models import Line
>>>
>>> model = Line()
>>> status = model.fit(xdata, ydata, sigma=yerrors, xmin=0., xmax=10.)
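Conceptually, fitting in a subrange amounts to discarding the points outside the requested interval before the fit. The sketch below shows this selection in plain Python; select_range is a hypothetical helper, and treating the interval edges as inclusive is an assumption, not something the aptapy documentation states.

```python
def select_range(xdata, ydata, sigma, xmin=float("-inf"), xmax=float("inf")):
    """Keep only the points whose x value lies within [xmin, xmax].

    Hypothetical helper; the inclusive edges are an assumption.
    """
    selected = [(x, y, s) for x, y, s in zip(xdata, ydata, sigma) if xmin <= x <= xmax]
    if not selected:
        # Nothing survives the cut.
        return [], [], []
    # Unzip the surviving triplets back into three separate lists.
    xs, ys, ss = (list(column) for column in zip(*selected))
    return xs, ys, ss


xdata = [-5.0, 0.0, 2.5, 10.0, 12.0]
ydata = [1.0, 2.0, 3.0, 4.0, 5.0]
yerrors = [0.1] * 5
xs, ys, ss = select_range(xdata, ydata, yerrors, xmin=0.0, xmax=10.0)
```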
You can set bounds on the fit parameters, e.g., force the slope to be positive by doing
>>> from aptapy.models import Line
>>>
>>> model = Line()
>>> model.slope.minimum = 0.
>>> status = model.fit(xdata, ydata, sigma=yerrors)
and you can freeze any of the parameters to a fixed value during the fit
>>> from aptapy.models import Line
>>>
>>> model = Line()
>>> model.intercept.freeze(0.)
>>> status = model.fit(xdata, ydata, sigma=yerrors)
Or, really, any combination of the above. The fit status is able to pretty-print itself, and the fitted model can be plotted by just doing
>>> import matplotlib.pyplot as plt
>>>
>>> model.plot()
>>> plt.legend()
(The legend bit is put there on purpose, as by default the fitting model will add a nice entry in the legend with all the relevant information.)
Fitting models interact nicely with one-dimensional histograms from the
hist module, so you can do
>>> import numpy as np
>>> from aptapy.hist import Histogram1d
>>> from aptapy.models import Line
>>>
>>> hist = Histogram1d(np.linspace(0., 1., 100))
>>> hist.fill(np.random.rand(1000))
>>> model = Line()
>>> status = model.fit(hist)
Location-scale models#
A number of models included in this module belong to the location-scale family, i.e., they can be expressed in terms of a location parameter and a non-negative scale parameter and, ultimately, are characterized by a universal shape function \(g(z)\) of the standardized variable
\[z = \frac{x - m}{s},\]
where \(m\) is the location parameter and \(s\) is the scale parameter. The Gaussian probability density function is the prototypical example of a location-scale model (with the mean as location and the standard deviation as scale), but many other models belong to this family, both peak-like and sigmoid-like.
From the point of view of the practical implementation, most of the location-scale
models in aptapy.models wrap scipy.stats.rv_continuous distributions,
which already provide most of the necessary functionality.
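As a concrete illustration of the location-scale idea, the Gaussian density can be written as a fixed shape function of the standardized variable z = (x - m) / s, divided by the scale s. The sketch below uses only the standard library and hypothetical function names; it checks the result against statistics.NormalDist and is not meant to describe how aptapy implements its models.

```python
import math
from statistics import NormalDist


def standard_gaussian_shape(z):
    """Universal shape g(z): the standard normal density."""
    return math.exp(-0.5 * z**2) / math.sqrt(2.0 * math.pi)


def gaussian_pdf(x, location, scale):
    """Location-scale form of the density: g((x - m) / s) / s."""
    z = (x - location) / scale
    return standard_gaussian_shape(z) / scale


value = gaussian_pdf(1.5, location=1.0, scale=2.0)
reference = NormalDist(mu=1.0, sigma=2.0).pdf(1.5)
```

The division by the scale is what keeps the density normalized when the curve is stretched along the x axis.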
Sigmoid models#
Location-scale sigmoid-like models all inherit from the abstract base class
AbstractSigmoidFitModel, and, just like in the previous
case, they must provide a concrete implementation of the shape function.
The latter is expected to be a monotonically increasing function, ranging
from 0 to 1 as its argument goes from \(-\infty\) to \(+\infty\), and the meaning
of the amplitude parameter \(A\) in this case is that of the total change in the function
value across the transition region, so that the general model reads
\[f(x) = A \, g\left(\frac{x - m}{s}\right)\]
(note that there is no division by the scale parameter in this case). Correspondingly, the implementation of the evaluation method reads:
def evaluate(self, x: ArrayLike, amplitude: float, location: float,
             scale: float, *parameter_values: float) -> ArrayLike:
    """Overloaded method for evaluating the model.

    Note if the scale is negative, we take the complement of the sigmoid function.
    """
    # pylint: disable=arguments-differ
    z = (x - location) / abs(scale)
    val = amplitude * self.shape(z, *parameter_values)
    return val if scale >= 0. else amplitude - val
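To see the sign convention at work, here is a minimal standalone sketch that plugs a logistic function in as the shape. Choosing the logistic function, and the names logistic_shape and evaluate_sigmoid, are assumptions made for illustration only; the sketch works on scalars and does not guard against a zero scale.

```python
import math


def logistic_shape(z):
    """Monotonically increasing shape, going from 0 to 1."""
    return 1.0 / (1.0 + math.exp(-z))


def evaluate_sigmoid(x, amplitude, location, scale):
    """Scalar version of the evaluation logic shown above."""
    z = (x - location) / abs(scale)
    val = amplitude * logistic_shape(z)
    # A negative scale flips the transition: rising becomes falling.
    return val if scale >= 0.0 else amplitude - val


rising = evaluate_sigmoid(5.0, amplitude=2.0, location=0.0, scale=1.0)
falling = evaluate_sigmoid(5.0, amplitude=2.0, location=0.0, scale=-1.0)
midpoint = evaluate_sigmoid(0.0, amplitude=2.0, location=0.0, scale=1.0)
```

With a positive scale the function approaches the amplitude for large x, with a negative scale it approaches zero, and in both cases the value at the location is half the amplitude.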
Gaussian forests#
The GaussianForestBase class provides a base for
implementing models made of a forest of Gaussian peaks with fixed energies.
An example of such a model is provided by the Fe55Forest
class, which implements a forest of Gaussian peaks for the Kα and Kβ emission
features of \(^{55}\mathrm{Fe}\) decay.
Model of a forest of Gaussian peaks with fixed energies \(E_i\). Each peak has amplitude \(A_i\), while all peaks share a global energy scale \(E_s\) and a common width \(\sigma\), scaled as \(1/\sqrt{E_i / E_0}\).
Note
Base class used by models generated with aptapy.modeling.line_forest().
Composite models#
The modeling module also provides a way to build composite models by summing
simple ones. This is achieved by means of the FitModelSum class,
which is designed to hold a list of components and interoperate with the rest
of the world in exactly the same fashion as simple models.
Chances are you will never have to instantiate a FitModelSum
object directly, as the + operator will do the trick in most of the cases, e.g.,
>>> from aptapy.models import Line, Gaussian
>>> model = Line() + Gaussian()
>>> status = model.fit(xdata, ydata, sigma=yerrors)
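The mechanism behind the + operator can be sketched in a few lines of plain Python. ModelSketch and SumSketch are hypothetical classes written for this example and are much simpler than FitModelSum; they only show how a sum object can hold its components, concatenate their parameters and evaluate as the sum of the parts.

```python
import math


class ModelSketch:
    """Hypothetical simple model: a function plus its parameter values."""

    def __init__(self, function, *values):
        self.function = function
        self.values = list(values)

    def __call__(self, x):
        return self.function(x, *self.values)

    def __add__(self, other):
        return SumSketch(self, other)


class SumSketch:
    """Hypothetical composite model holding a list of components."""

    def __init__(self, *components):
        self.components = list(components)

    def parameter_values(self):
        # Parameters of the sum: those of each component, in order.
        return tuple(v for c in self.components for v in c.values)

    def __call__(self, x):
        return sum(c(x) for c in self.components)

    def __add__(self, other):
        return SumSketch(*self.components, other)


def line(x, slope, intercept):
    return slope * x + intercept


def gaussian(x, amplitude, mean, sigma):
    return amplitude * math.exp(-0.5 * ((x - mean) / sigma) ** 2)


model = ModelSketch(line, 2.0, 1.0) + ModelSketch(gaussian, 10.0, 0.0, 1.0)
```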
Module documentation#
Modeling core facilities.
- class aptapy.modeling.Format(*values)[source]#
Small enum class to control string formatting.
This is leveraging the custom formatting of the uncertainties package, where a trailing P means “pretty print” and a trailing L means “LaTeX”.
- PRETTY = 'P'#
- LATEX = 'L'#
- class aptapy.modeling.FitParameter(value: float, _name: str = None, error: float = None, _frozen: bool = False, minimum: float = -inf, maximum: float = inf)[source]#
Small class describing a fit parameter.
- value: float#
- _name: str = None#
- error: float = None#
- _frozen: bool = False#
- minimum: float = -inf#
- maximum: float = inf#
- property name: str#
Return the parameter name.
We are wrapping this into a property because, arguably, the parameter name is the only thing we never, ever want to change after the fact.
Returns#
- name : str
The parameter name.
- property frozen: bool#
Return True if the parameter is frozen.
We are wrapping this into a property because we interact with this member via the freeze() and thaw() methods.
Returns#
- frozen : bool
True if the parameter is frozen.
- is_bound() bool[source]#
Return True if the parameter is bounded.
Returns#
- bounded : bool
True if the parameter is bounded.
- copy(name: str) FitParameter[source]#
Create a copy of the parameter object with a new name.
This is necessary because we define the fit parameters of the actual model as class variables holding the default value, and each instance gets their own copy of the parameter, where the name is automatically inferred.
Note that, in addition to the name being passed as an argument, we only carry over the value and bounds of the original fit parameter: the new object is created with error = None and _frozen = False.
Arguments#
- name : str
The name for the new FitParameter object.
Returns#
- parameter : FitParameter
The new FitParameter object.
- set(value: float, error: float = None) None[source]#
Set the parameter value and error.
Arguments#
- value : float
The new value for the parameter.
- error : float, optional
The new error for the parameter (default None).
- init(value: float) None[source]#
Initialize the fit parameter to a given value, unless it is frozen, or the value is out of bounds.
Warning
Note this silently does nothing if the parameter is frozen, or if the value is out of bounds, so its behavior is inconsistent with that of set(), which raises an exception in both cases. This is intentional, and this method should only be used to initialize the parameter prior to a fit.
Arguments#
- value : float
The new value for the parameter.
- freeze(value: float) None[source]#
Freeze the fit parameter to a given value.
Note that the error is set to None.
Arguments#
- value : float
The new value for the parameter.
- ufloat() ufloat[source]#
Return the parameter value and error as a ufloat object.
Returns#
- ufloat : uncertainties.ufloat
The parameter value and error as a ufloat object.
- pull(expected: float) float[source]#
Calculate the pull of the parameter with respect to a given expected value.
Arguments#
- expected : float
The expected value for the parameter.
Returns#
- pull : float
The pull of the parameter with respect to the expected value, defined as (value - expected) / error.
Raises#
- RuntimeError
If the parameter has no error associated to it.
- compatible_with(expected: float, num_sigma: float = 3.0) bool[source]#
Check if the parameter is compatible with an expected value within num_sigma.
Arguments#
- expected : float
The expected value for the parameter.
- num_sigma : float, optional
The number of sigmas to use for the compatibility check (default 3).
Returns#
- compatible : bool
True if the parameter is compatible with the expected value within num_sigma.
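The pull and the compatibility check can be worked out by hand with two small functions. This is a standalone sketch with hypothetical free functions rather than the FitParameter methods themselves; in particular, whether the library compares with <= or < is an assumption.

```python
def pull(value, error, expected):
    """Pull of a measurement: (value - expected) / error."""
    if error is None:
        raise RuntimeError("the parameter has no error associated to it")
    return (value - expected) / error


def compatible_with(value, error, expected, num_sigma=3.0):
    """True if |pull| does not exceed num_sigma (the <= is an assumption)."""
    return abs(pull(value, error, expected)) <= num_sigma


two_sigma_pull = pull(1.2, 0.1, 1.0)       # roughly 2 sigma above expectation
close_enough = compatible_with(1.2, 0.1, 1.0)
too_far = compatible_with(1.5, 0.1, 1.0)   # roughly 5 sigma away
```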
- class aptapy.modeling.FitStatus(popt: ndarray = None, pcov: ndarray = None, chisquare: float = None, dof: int = None, pvalue: float = None, correlated_pars: ndarray = None)[source]#
Small dataclass to hold the fit status.
- popt: ndarray = None#
- pcov: ndarray = None#
- chisquare: float = None#
- dof: int = None#
- pvalue: float = None#
- valid() bool[source]#
Return True if the fit status is valid, i.e., all the fields are set.
Returns#
- valid : bool
True if the fit status is valid.
- update(popt: ndarray, pcov: ndarray, chisquare: float, dof: int) None[source]#
Update the fit status, i.e., set the chisquare and calculate the corresponding p-value.
Arguments#
- popt : array_like
The optimal values for the fit parameters.
- pcov : array_like
The covariance matrix for the fit parameters.
- chisquare : float
The chisquare of the fit.
- dof : int
The number of degrees of freedom of the fit.
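The p-value associated with a chisquare is the probability of observing a value at least as large for the given number of degrees of freedom. In practice one would typically rely on scipy.stats.chi2.sf for this; how FitStatus computes it internally is not shown here. As a dependency-free worked example, the sketch below uses the closed form that holds for an even number of degrees of freedom only (the function name is hypothetical).

```python
import math


def chisquare_pvalue_even_dof(chisquare, dof):
    """Chi-square survival function, valid for even, positive dof only.

    For dof = 2m:  p = exp(-x / 2) * sum_{k=0}^{m-1} (x / 2)^k / k!
    """
    if dof <= 0 or dof % 2:
        raise ValueError("this closed form only holds for even, positive dof")
    half = 0.5 * chisquare
    term, total = 1.0, 1.0
    for k in range(1, dof // 2):
        term *= half / k  # builds (x / 2)^k / k! incrementally
        total += term
    return math.exp(-half) * total


pvalue = chisquare_pvalue_even_dof(4.0, 4)
```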
- class aptapy.modeling.AbstractFitModelBase(label: str = None, xlabel: str = None, ylabel: str = None)[source]#
Abstract base class for all the fit classes.
This acts as a base class for both simple fit models and for composite models (e.g., sums of simple ones).
Arguments#
- label : str, optional
The label for the model. If this is None, the model name is used as default, which makes sense because the name is how we would label a fit model in most circumstances.
- xlabel : str, optional
The label for the x-axis.
- ylabel : str, optional
The label for the y-axis.
- abstractmethod static evaluate(x: float | ndarray, *parameter_values: float) float | ndarray[source]#
Evaluate the model at a given set of parameter values.
Arguments#
- x : array_like
The value(s) of the independent variable.
- parameter_values : sequence of float
The value of the model parameters.
Returns#
- y : array_like
The value(s) of the model at the given value(s) of the independent variable for a given set of parameter values.
- _wrap_evaluate() Callable[source]#
Helper function to build a wrapper around the evaluate() method with the (correct) explicit signature, including all the parameter names.
This is used, e.g., by FitModelSum and GaussianForestBase to wrap the evaluate() method, which is expressed in terms of a generic signature, before the method itself is passed to the freeze() method.
- jacobian(x: float | ndarray, *parameter_values: float, eps: float = 1e-08) ndarray[source]#
Numerically calculate the Jacobian matrix of partial derivatives of the model with respect to the parameters.
This is used, e.g., to plot confidence bands around the best-fit model.
Arguments#
- x : array_like
The value(s) of the independent variable.
- parameter_values : sequence of float
The value of the model parameters. If no parameters are passed, the current values are used by default. Alternatively, all the model parameters must be passed, otherwise a ValueError is raised.
- eps : float, optional
The step size to use for the numerical differentiation.
Returns#
- J : ndarray
The Jacobian matrix of partial derivatives. The shape of the array is (m, n), where m is the number of data points where the Jacobian is calculated, and n the number of parameters.
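A numerical Jacobian of this kind can be sketched with simple finite differences. The version below uses forward differences on plain Python lists; whether aptapy uses forward or central differences is not stated in this page, so that choice, like the function name, is an assumption.

```python
def numerical_jacobian(model, xs, parameter_values, eps=1e-8):
    """Forward-difference Jacobian with shape (len(xs), len(parameter_values))."""
    rows = []
    for x in xs:
        base = model(x, *parameter_values)
        row = []
        for i in range(len(parameter_values)):
            shifted = list(parameter_values)
            shifted[i] += eps  # nudge one parameter at a time
            row.append((model(x, *shifted) - base) / eps)
        rows.append(row)
    return rows


def line(x, slope, intercept):
    return slope * x + intercept


# For a straight line the exact derivatives are d/dslope = x, d/dintercept = 1.
jac = numerical_jacobian(line, [0.0, 1.0, 2.0], (1.5, 0.5))
```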
- name() str[source]#
Return the model name, e.g., for legends.
Note this can be reimplemented in concrete subclasses, but it should provide a sensible default value in most circumstances.
Returns#
- name : str
The model name.
- init_parameters(xdata: float | ndarray, ydata: float | ndarray, sigma: float | ndarray) None[source]#
Optional hook to change the current parameter values of the model, prior to a fit, based on the input data.
Arguments#
- xdata : array_like
The input values of the independent variable.
- ydata : array_like
The input values of the dependent variable.
- sigma : array_like
The input uncertainties on the dependent variable.
- parameter_values() Tuple[float][source]#
Return the current parameter values.
Note this only relies on the __iter__() method, so it works both for simple and composite models.
Returns#
- values : tuple of float
The current parameter values.
- free_parameters() Tuple[FitParameter][source]#
Return the list of free parameters.
Note this only relies on the __iter__() method, so it works both for simple and composite models.
Returns#
- parameters : tuple of FitParameter
The list of free parameters.
- free_parameter_values() Tuple[float][source]#
Return the current values of the free parameters.
Returns#
- values : tuple of float
The current values of the free parameters.
- bounds() Tuple[float | ndarray, float | ndarray][source]#
Return the bounds on the fit parameters in a form that can be used by the fitting method.
Returns#
- bounds : 2-tuple of array_like
The lower and upper bounds on the (free) fit parameters.
- set_parameters(*parameter_values: float) None[source]#
Set the model parameters to the given values.
Arguments#
- parameter_values : sequence of float
The new values for the model parameters.
- update_parameters(popt: ndarray, pcov: ndarray) None[source]#
Update the model parameters based on the output of the curve_fit() call.
Arguments#
- popt : array_like
The optimal values for the fit parameters.
- pcov : array_like
The covariance matrix for the fit parameters.
- calculate_chisquare(xdata: ndarray, ydata: ndarray, sigma) float[source]#
Calculate the chisquare of the fit to some input data with the current model parameters.
Arguments#
- xdata : array_like
The input values of the independent variable.
- ydata : array_like
The input values of the dependent variable.
- sigma : array_like
The input uncertainties on the dependent variable.
Returns#
- chisquare : float
The chisquare of the fit.
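For reference, the chisquare is the sum of the squared residuals between data and model, each divided by the corresponding uncertainty. The standalone sketch below spells this out on plain lists; the chisquare function here is a hypothetical helper, not the method documented above.

```python
def chisquare(model, xdata, ydata, sigma, *parameter_values):
    """Sum of squared, uncertainty-normalized residuals."""
    return sum(
        ((y - model(x, *parameter_values)) / s) ** 2
        for x, y, s in zip(xdata, ydata, sigma)
    )


def line(x, slope, intercept):
    return slope * x + intercept


xdata = [0.0, 1.0, 2.0]
ydata = [1.5, 3.0, 4.0]
sigma = [0.5, 0.5, 1.0]
# The model y = 2x + 1 predicts 1, 3, 5: normalized residuals are 1, 0, -1.
value = chisquare(line, xdata, ydata, sigma, 2.0, 1.0)
```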
- static freeze(model_function, **constraints) Callable[source]#
Freeze a subset of the model parameters.
Arguments#
- model_function : callable
The model function to freeze parameters for.
- constraints : dict
The parameters to freeze, as keyword arguments.
Returns#
- wrapper : callable
A wrapper around the model function with the given parameters frozen.
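The idea of freezing parameters by wrapping the model function can be sketched as follows. This is a simplified, hypothetical re-implementation based on inspect.signature, written only to illustrate the concept; it is not the library code, and the returned wrapper takes its free parameters as generic positional arguments.

```python
import inspect


def freeze(model_function, **constraints):
    """Return a wrapper where the given parameters are fixed (sketch)."""
    names = list(inspect.signature(model_function).parameters)
    parameter_names = names[1:]  # the first argument is the independent variable
    unknown = set(constraints) - set(parameter_names)
    if unknown:
        raise ValueError(f"unknown parameter(s): {sorted(unknown)}")
    free_names = [name for name in parameter_names if name not in constraints]

    def wrapper(x, *free_values):
        if len(free_values) != len(free_names):
            raise TypeError(f"expected {len(free_names)} free parameter value(s)")
        kwargs = dict(zip(free_names, free_values))
        kwargs.update(constraints)  # add the frozen values back in
        return model_function(x, **kwargs)

    return wrapper


def line(x, slope, intercept):
    return slope * x + intercept


through_origin = freeze(line, intercept=0.0)  # only the slope is left free
fixed_slope = freeze(line, slope=2.0)         # only the intercept is left free
```

Note that a wrapper with a generic *free_values signature would need explicit initial values when handed to scipy.optimize.curve_fit, since the number of parameters cannot be inferred from it.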
- fit(xdata: float | ndarray | Histogram1d, ydata: float | ndarray = None, *, p0: float | ndarray = None, sigma: float | ndarray = None, absolute_sigma: bool = False, xmin: float = -inf, xmax: float = inf, **kwargs) FitStatus[source]#
Fit a series of points.
Arguments#
- xdata : array_like or one-dimensional histogram
The input values of the independent variable or a 1-dimensional histogram.
- ydata : array_like, optional
The input values of the dependent variable.
- p0 : array_like, optional
The initial values for the fit parameters.
- sigma : array_like, optional
The input uncertainties on the dependent variable.
- absolute_sigma : bool, optional (default False)
See the curve_fit() documentation for details.
- xmin : float, optional (default -inf)
The minimum value of the independent variable to fit. Note that if xmin > xmax the (xmax, xmin) interval is excluded from the fit.
- xmax : float, optional (default inf)
The maximum value of the independent variable to fit. Note that if xmin > xmax the (xmax, xmin) interval is excluded from the fit.
- kwargs : dict, optional
Additional keyword arguments passed to curve_fit().
Returns#
- status : FitStatus
The status of the fit.
- static default_plotting_range() Tuple[float, float][source]#
Return the default plotting range for the model.
This can be reimplemented in concrete models, and can be parameter-dependent (e.g., for a gaussian we might want to plot within 5 sigma from the mean by default). And if you are tempted to move this to a DEFAULT_PLOTTING_RANGE class variable, keep in mind that having it as a method allows for parameter-dependent default ranges.
Returns#
- Tuple[float, float]
The default plotting range for the model.
- set_plotting_range(xmin: float, xmax: float) None[source]#
Set a custom plotting range for the model.
Arguments#
- xmin : float
The minimum x value for plotting.
- xmax : float
The maximum x value for plotting.
- plotting_range() Tuple[float, float][source]#
Return the current plotting range for the model.
If a custom plotting range has been set via set_plotting_range(), or as a part of a fit, that is returned, otherwise the default plotting range for the model is used.
Returns#
- Tuple[float, float]
The plotting range for the model.
- _plotting_grid() ndarray[source]#
Return the grid of x values to use for plotting the model.
Returns#
- x : np.ndarray
The x values used for plotting the model.
- _render(axes: Axes = None, **kwargs) None[source]#
Render the model on the given axes.
Arguments#
- axes : matplotlib.axes.Axes, optional
The axes to plot on (default: current axes).
- kwargs : dict, optional
Additional keyword arguments passed to axes.plot().
- plot(axes: Axes = None, fit_output: bool = False, **kwargs) Axes[source]#
Plot the model.
Arguments#
- axes : matplotlib.axes.Axes, optional
The axes to plot on (default: current axes).
- kwargs : dict, optional
Additional keyword arguments passed to plt.plot().
- confidence_band(x: float | ndarray, num_sigma: float = 1.0) ndarray[source]#
Return the vertical width of the n-sigma confidence band at the given x values.
Note this assumes that the model has been fitted to data and is equipped with a valid FitStatus. A RuntimeError is raised if that is not the case.
Arguments#
- x : array_like
The x values where the confidence delta is calculated.
- num_sigma : float
The number of sigmas for the band (default 1).
Returns#
- delta : np.ndarray
The vertical width of the n-sigma confidence band at the given x values.
- plot_confidence_band(axes: Axes = None, num_sigma: float = 1.0, **kwargs) Axes[source]#
Plot the n-sigma confidence band around the best-fit model.
Arguments#
- axes : matplotlib.axes.Axes, optional
The axes to plot on (default: current axes).
- num_sigma : float, optional
The number of sigmas for the confidence band (default: 1).
- kwargs : dict, optional
Additional keyword arguments passed to axes.fill_between().
Returns#
- matplotlib.axes.Axes
The axes with the confidence band plotted.
- random_fit_dataset(sigma: float | ndarray, num_points: int = 25, seed: int = None) Tuple[ndarray, ndarray][source]#
Generate a random sample from the model, adding gaussian noise.
Arguments#
- sigma : array_like
The standard deviation of the gaussian noise to add to the model.
- num_points : int, optional
The number of points to generate (default 25).
- seed : int, optional
The random seed to use (default None).
Returns#
- xdata : np.ndarray
The x values of the random sample.
- ydata : np.ndarray
The y values of the random sample.
- rvs(size: int = 1, random_state=None)[source]#
Generate random variates from the underlying distribution at the current parameter values.
Arguments#
- size : int, optional
The number of random variates to generate (default 1).
- random_state : int or np.random.Generator, optional
The random seed or generator to use (default None).
- random_histogram(edges: ndarray, size: int, random_state=None) Histogram1d[source]#
Generate a histogram filled with random variates from the underlying distribution at the current parameter values.
Arguments#
- edges : np.ndarray
The bin edges of the histogram.
- size : int, optional
The number of random variates to generate (default 100000).
- random_state : int or np.random.Generator, optional
The random seed or generator to use (default None).
Returns#
- Histogram1d
A histogram filled with random variates from the distribution.
- _format_fit_output(spec: str) str[source]#
String formatting for fit output.
Arguments#
- spec : str
The format specification.
Returns#
- text : str
The formatted string.
- class aptapy.modeling.AbstractFitModel(label: str = None, xlabel: str = None, ylabel: str = None)[source]#
Abstract base class for a fit model.
- classmethod _parameter_dict() Dict[str, FitParameter][source]#
Return a dictionary of all the FitParameter objects defined in the class and its base classes.
This is a subtle one, as what we really want, here, is all members of a class (including inherited ones) that are of a specific type (FitParameter), in the order they were defined. All of these things are instrumental to make the fit model work, so we need to be careful.
Also note that we are looping over the MRO in reverse order, so that we preserve the order of definition of the parameters, even when they are inherited from base classes. If a parameter is re-defined in a derived class, the derived class definition takes precedence, as we are using a dictionary to collect the parameters.
Arguments#
- cls : type
The class to inspect.
Returns#
- param_dict : dict
A dictionary mapping parameter names to their FitParameter objects.
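The reverse-MRO collection described above can be illustrated with a short, self-contained sketch. Param, parameter_dict and the two toy classes are hypothetical names used only for this example; the point is that walking the MRO from the most generic base to the class itself preserves the definition order, while a re-definition in a derived class overrides the inherited value without changing its position.

```python
class Param:
    """Hypothetical stand-in for FitParameter."""

    def __init__(self, value):
        self.value = value


def parameter_dict(cls):
    """Collect Param class attributes, base classes first."""
    param_dict = {}
    for base in reversed(cls.__mro__):
        # vars(base) preserves the order in which attributes were defined.
        for name, attr in vars(base).items():
            if isinstance(attr, Param):
                param_dict[name] = attr  # derived definitions overwrite inherited ones
    return param_dict


class Base:
    amplitude = Param(1.0)
    location = Param(0.0)


class Derived(Base):
    location = Param(5.0)  # re-defined in the derived class
    scale = Param(2.0)


params = parameter_dict(Derived)
```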
- quadrature(x1: float, x2: float) float[source]#
Calculate the integral of the model between x1 and x2 using numerical integration.
Arguments#
- x1 : float
The minimum value of the independent variable to integrate over.
- x2 : float
The maximum value of the independent variable to integrate over.
Returns#
- integral : float
The integral of the model between x1 and x2.
- integral(x1: float, x2: float) float[source]#
Default implementation of the integral of the model between x1 and x2. Subclasses can (and are encouraged to) overload this method with an analytical implementation, when available.
Arguments#
- x1 : float
The minimum value of the independent variable to integrate over.
- x2 : float
The maximum value of the independent variable to integrate over.
Returns#
- integral : float
The integral of the model between x1 and x2.
- class aptapy.modeling.AbstractSigmoidFitModel(label: str = None, xlabel: str = None, ylabel: str = None)[source]#
Abstract base class for fit models representing sigmoids.
- amplitude = FitParameter(value=1.0, _name=None, error=None, _frozen=False, minimum=-inf, maximum=inf)#
- location = FitParameter(value=0.0, _name=None, error=None, _frozen=False, minimum=-inf, maximum=inf)#
- scale = FitParameter(value=1.0, _name=None, error=None, _frozen=False, minimum=-inf, maximum=inf)#
- abstractmethod static shape(z: float | ndarray, *parameter_values: float) float | ndarray[source]#
Abstract method for the normalized shape of the sigmoid model. Subclasses must implement this method.
Arguments#
- z : array_like
The normalized independent variable.
- parameter_values : float
Additional shape parameters for the sigmoid.
Returns#
- array_like
The value of the sigmoid shape function at z.
- evaluate(x: float | ndarray, amplitude: float, location: float, scale: float, *parameter_values: float) float | ndarray[source]#
Overloaded method for evaluating the model.
Note if the scale is negative, we take the complement of the sigmoid function.
- init_parameters(xdata: float | ndarray, ydata: float | ndarray, sigma: float | ndarray = 1.0)[source]#
Overloaded method.
- default_plotting_range() Tuple[float, float][source]#
Overloaded method.
By default the plotting range is set to be an interval centered on the location parameter, and extending for a number of scale units on each side.
- class aptapy.modeling.AbstractCRVFitModel(label: str = None, xlabel: str = None, ylabel: str = None)[source]#
Abstract base class for fit models based on continuous random variables.
(Typically we will use this, in conjunction with the wrap_rv_continuous decorator, to wrap continuous random variables from scipy.stats).
The general rule for the signature of scipy distributions is that they accept all the shape parameters first, and then loc and scale. The wrap_rv_continuous decorator relies on this rule to create a fit model class with the appropriate methods: it reads dist.shapes (and numargs) to determine the positional shape arguments, and assumes that the loc and scale keywords are always supported.
- amplitude = FitParameter(value=1.0, _name=None, error=None, _frozen=False, minimum=-inf, maximum=inf)#
- location = FitParameter(value=0.0, _name=None, error=None, _frozen=False, minimum=-inf, maximum=inf)#
- scale = FitParameter(value=1.0, _name=None, error=None, _frozen=False, minimum=0, maximum=inf)#
- _rv = None#
- classmethod evaluate(x, amplitude, location, scale, *args)[source]#
Overloaded method for evaluating the model.
This takes the pdf of the underlying distribution and scales it by the amplitude.
- classmethod primitive(x, amplitude, location, scale, *args)[source]#
Overloaded method for evaluating the primitive of the model.
Note this is not just a primitive, it is the actual cumulative distribution function (cdf) scaled by the amplitude. We keep the primitive() name because, in general, not all the fit models are normalizable, and we still want to keep a common interface.
- support()[source]#
Return the support of the underlying distribution at the current parameter values.
- ppf(p: float | ndarray)[source]#
Return the percent point function (inverse of cdf) of the underlying distribution for a given quantile at the current parameter values.
Arguments#
- p : array_like
The quantile(s) to evaluate the ppf at.
- std()[source]#
Return the standard deviation of the underlying distribution at the current parameter values.
- rvs(size: int = 1, random_state=None)[source]#
Generate random variates from the underlying distribution at the current parameter values.
Arguments#
- size : int, optional
The number of random variates to generate (default 1).
- random_state : int or np.random.Generator, optional
The random seed or generator to use (default None).
- init_parameters(xdata: float | ndarray, ydata: float | ndarray, sigma: float | ndarray = 1.0) None[source]#
Overloaded method.
This is tailored on unimodal distributions, where we start from the basic statistics (average, standard deviation and area) of the input sample and try to match the amplitude, location and scale of the distribution to be fitted. No attempt is made at setting the shape parameters (if any).
- default_plotting_range() Tuple[float, float][source]#
Overloaded method.
Note we have access to all the goodies of a scipy.stats.rv_continuous object here (e.g., the support of the function, and the mean and standard deviation when they are finite), so we can be fairly clever in setting up a generic method that works out of the box in many cases.
- plot(axes: Axes = None, fit_output: bool = False, plot_mean: bool = True, **kwargs) None[source]#
Plot the model.
Note this is reimplemented from scratch to allow overplotting the mean of the distribution.
Arguments#
- axes : matplotlib.axes.Axes, optional
The axes to plot on (default: current axes).
- fit_output : bool, optional
Whether to include the fit output in the legend (default: False).
- plot_mean : bool, optional
Whether to overplot the mean of the distribution (default: True).
- kwargs : dict, optional
Additional keyword arguments passed to plt.plot().
- class aptapy.modeling.PhonyCRVFitModel(scipy_version: str)[source]#
Phony class to provide a mechanism not to break everything when a particular scipy.stats distribution is not available in a given scipy version.
- aptapy.modeling.wrap_rv_continuous(rv, **shape_parameters) type[source]#
Decorator to wrap a scipy.stats.rv_continuous object into a fit model.
This is fairly minimal, and basically amounts to adding all the necessary shape parameters to the underlying fit model class. Note the name of the parameters is inferred from the rv.shapes attribute, and each shape parameter is set to 1. by default (with a minimum of 0.) unless this is overridden via the shape_parameters argument.
Arguments#
- rv : scipy.stats.rv_continuous
The scipy.stats.rv_continuous object to wrap.
- shape_parameters : dict, optional
Additional shape parameters to be set up with non-default FitParameter objects (e.g., to set different minimum/maximum values).
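To make the role of rv.shapes concrete, the sketch below reads the shape parameter names off a few scipy.stats distributions. The helpers shape_names and default_shape_settings are hypothetical, and plain dictionaries stand in for FitParameter objects; this only mirrors the defaults described above (value 1., minimum 0.) and is not the actual implementation.

```python
from scipy import stats


def shape_names(rv):
    """Return the list of shape parameter names of a scipy.stats distribution.

    Hypothetical helper: rv.shapes is either None (no shape parameters)
    or a comma-separated string of names.
    """
    if rv.shapes is None:
        return []
    return [name.strip() for name in rv.shapes.split(",")]


def default_shape_settings(rv, **overrides):
    """Map each shape parameter name to default settings, with overrides.

    Plain dictionaries are used here in place of FitParameter objects.
    """
    settings = {name: {"value": 1.0, "minimum": 0.0} for name in shape_names(rv)}
    settings.update(overrides)
    return settings


print(shape_names(stats.norm))   # no shape parameters
print(shape_names(stats.gamma))  # one shape parameter
print(shape_names(stats.beta))   # two shape parameters
print(default_shape_settings(stats.beta, b={"value": 2.0, "minimum": 0.5}))
```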
- aptapy.modeling.line_forest(*energies: float) Callable[[type], type][source]#
Decorator to build a line forest fit model.
A line forest is a collection of spectral lines at known energies, each with an independent amplitude, all sharing a common energy scale and with a line width (sigma) that scales as the square root of the line energy.
This decorator simply adds a class attribute to store the line energies and creates all the necessary FitParameter objects.
While the decorator is agnostic as to the actual line shape, the GaussianForestBase class is a good example of how to use this decorator to build a line forest fit model.
Arguments#
- energiesfloat
The energies of the lines comprised in the forest. (These are typically provided in physical units, e.g., keV, whereas the energy scale parameter determines the conversion between the energy and whatever units the fit model is actually evaluated in, e.g., ADC counts.)
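The general pattern of a class decorator that takes arguments can be sketched as follows. This is a generic stand-in and not the aptapy implementation: line_forest_sketch, the attribute names and the use of plain floats instead of FitParameter objects are all assumptions made for illustration.

```python
def line_forest_sketch(*energies):
    """Return a class decorator that stores the line energies on the class.

    Hypothetical stand-in: plain floats replace FitParameter objects, and
    all attribute names are made up for illustration.
    """
    def decorator(cls):
        # Class attribute holding the line energies.
        cls.energies = tuple(energies)
        # One independent amplitude per line.
        for i in range(len(energies)):
            setattr(cls, f"amplitude{i}", 1.0)
        # Parameters shared by all the lines.
        cls.energy_scale = 1.0
        cls.sigma = 1.0
        return cls
    return decorator


@line_forest_sketch(5.9, 6.5)
class TwoLineForest:
    """Toy class with two lines (illustrative energies, e.g., in keV)."""


print(TwoLineForest.energies)
```

Since the decorator returns the class it receives, the decorated class can still inherit from any base class that implements the actual line shape.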
- class aptapy.modeling.GaussianForestBase(label: str = None, xlabel: str = None, ylabel: str = None)[source]#
Abstract base model representing a forest of Gaussian spectral lines at fixed energies.
Concrete models need to be decorated with the @line_forest decorator, specifying the energies of the lines included in the forest.
Each peak corresponds to a known energy, and the model allows for fitting the amplitudes, a global energy scale, and a common width (sigma) that scales as the square root of the line energy, as is commonly observed in particle detectors.
- evaluate(x: float | ndarray, *parameter_values) float | ndarray[source]#
Evaluate the model at a given set of parameter values.
Arguments#
- xarray_like
The value(s) of the independent variable.
- parameter_valuessequence of float
The value of the model parameters.
Returns#
- yarray_like
The value(s) of the model at the given value(s) of the independent variable for a given set of parameter values.
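A minimal numerical sketch of a Gaussian line forest is given below. It rests on two assumptions that are not confirmed by this documentation: the position of each line in fit units is taken to be energy / energy_scale, and sigma is taken to be the width of the first line, with the others scaled by the square root of the energy ratio. The function gaussian_forest is hypothetical.

```python
import numpy as np


def gaussian_forest(x, energies, amplitudes, energy_scale, sigma):
    """Sum of area-normalized Gaussians at known energies.

    Hypothetical sketch, not the aptapy implementation.
    """
    x = np.asarray(x, dtype=float)
    energies = np.asarray(energies, dtype=float)
    total = np.zeros_like(x)
    for energy, amplitude in zip(energies, amplitudes):
        # Assumption: line position in fit units is energy / energy_scale.
        mean = energy / energy_scale
        # Assumption: sigma refers to the first line and scales as sqrt(energy).
        width = sigma * np.sqrt(energy / energies[0])
        norm = width * np.sqrt(2.0 * np.pi)
        total += amplitude * np.exp(-0.5 * ((x - mean) / width) ** 2) / norm
    return total


# Two lines at 5.9 and 6.5 (illustrative values), mapped to 590 and 650.
x = np.linspace(400.0, 800.0, 4001)
y = gaussian_forest(x, (5.9, 6.5), (100.0, 20.0), energy_scale=0.01, sigma=10.0)
area = float(np.sum(y) * (x[1] - x[0]))
print(area, x[np.argmax(y)])
```

With area-normalized lines, the integral of the model is the sum of the amplitudes, which makes the amplitudes directly comparable across lines.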
- _intensities()[source]#
Return the current values of the line intensities for the forest, properly normalized to one.
- rvs(size: int = 1, random_state=None)[source]#
Generate random variates from the underlying distribution at the current parameter values.
Arguments#
- sizeint, optional
The number of random variates to generate (default 1).
- random_stateint or np.random.Generator, optional
The random seed or generator to use (default None).
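One common way to draw random variates from a sum of lines is to first pick a line with probability given by its normalized intensity and then sample from that line. The sketch below does this for Gaussian lines; forest_rvs is a hypothetical helper and the actual method may proceed differently.

```python
import numpy as np


def forest_rvs(means, sigmas, amplitudes, size=1, random_state=None):
    """Draw random variates from a mixture of Gaussian lines.

    Hypothetical sketch, not the aptapy implementation.
    """
    # Accepts None, an integer seed or an existing np.random.Generator.
    rng = np.random.default_rng(random_state)
    amplitudes = np.asarray(amplitudes, dtype=float)
    # Line intensities, normalized to one.
    intensities = amplitudes / amplitudes.sum()
    # Pick a line for each variate, then sample from the chosen Gaussian.
    line = rng.choice(len(intensities), size=size, p=intensities)
    return rng.normal(np.asarray(means)[line], np.asarray(sigmas)[line])


sample = forest_rvs([590.0, 650.0], [10.0, 10.5], [100.0, 20.0], size=10000, random_state=42)
fraction_first_line = float(np.mean(sample < 620.0))
print(sample.shape, fraction_first_line)
```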
- init_parameters(xdata: float | ndarray, ydata: float | ndarray, sigma: float | ndarray = 1.0) None[source]#
Overloaded method.
- fit_iterative(xdata: float | ndarray | Histogram1d, ydata: float | ndarray = None, *, p0: float | ndarray = None, sigma: float | ndarray = None, num_sigma_left: float = 2.0, num_sigma_right: float = 2.0, num_iterations: int = 2, **kwargs) FitStatus[source]#
Iteratively fit line forest spectrum data within a given number of sigma around the peaks.
This function first fits the data (either a histogram or scatter plot data) and then repeats the fit iteratively, limiting the fit range to an interval defined in terms of deviations (in sigma) around the peaks.
Arguments#
- xdataarray_like or Histogram1d
The data (scatter plot x values) or histogram to fit.
- ydataarray_like, optional
The y data to fit (if xdata is not a Histogram1d).
- p0array_like, optional
The initial values for the fit parameters.
- sigmaarray_like, optional
The uncertainties on the y data.
- num_sigma_leftfloat
The number of sigma on the left of the first peak to be used to define the fitting range.
- num_sigma_rightfloat
The number of sigma on the right of the last peak to be used to define the fitting range.
- num_iterationsint
The number of iterations of the fit.
- kwargsdict, optional
Additional keyword arguments passed to fit().
Returns#
- FitStatus
The results of the fit.
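The iterative scheme can be sketched with scipy.optimize.curve_fit on a single Gaussian line, in which case the first and last peak coincide. The function fit_iterative_sketch is hypothetical and only mimics the range restriction described above; it does not use the aptapy fitting engine.

```python
import numpy as np
from scipy.optimize import curve_fit


def gaussian(x, amplitude, mean, sigma):
    """Gaussian with peak height equal to amplitude."""
    return amplitude * np.exp(-0.5 * ((x - mean) / sigma) ** 2)


def fit_iterative_sketch(x, y, p0, num_sigma_left=2.0, num_sigma_right=2.0,
                         num_iterations=2):
    """Fit over the full range, then refit within a window around the peak.

    Hypothetical sketch, not the aptapy implementation.
    """
    # First round: fit over the full range.
    popt, _ = curve_fit(gaussian, x, y, p0=p0)
    for _ in range(num_iterations):
        mean, sigma = popt[1], abs(popt[2])
        # Restrict the fit range using the current estimates.
        xmin = mean - num_sigma_left * sigma
        xmax = mean + num_sigma_right * sigma
        mask = (x >= xmin) & (x <= xmax)
        popt, _ = curve_fit(gaussian, x[mask], y[mask], p0=popt)
    return popt


# Noise-free Gaussian peak on top of a small flat background.
x = np.linspace(0.0, 100.0, 201)
y = gaussian(x, 100.0, 50.0, 5.0) + 1.0
p0 = (y.max(), x[np.argmax(y)], 10.0)
amplitude, mean, sigma = fit_iterative_sketch(x, y, p0)
print(amplitude, mean, abs(sigma))
```

Restricting the range to the core of the peak reduces the influence of anything the model does not describe (here the flat background) on the fitted parameters.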
- plot(axes: Axes = None, fit_output: bool = False, plot_components: bool = True, **kwargs) Axes[source]#
Overloaded method for plotting the model.
Arguments#
- axesmatplotlib.axes.Axes, optional
The axes on which to plot the model. If None, uses the current axes.
- fit_outputbool, optional
If True, displays the fit output on the legend. Default is False.
- plot_componentsbool, optional
If True, plots the individual components of the model as dashed lines. Default is True.
- kwargs
Additional keyword arguments passed to the parent class.
Returns#
None
- class aptapy.modeling.FitModelSum(*components: AbstractFitModel)[source]#
Composite model representing the sum of an arbitrary number of simple models.
Arguments#
- componentssequence of AbstractFitModel
The components of the composite model.
- freeze(model_function, **constraints) Callable[source]#
Overloaded method.
This is a tricky one, for two distinct reasons: (i) for a FitModelSum object, evaluate() is not a static method, as it needs to access the list of components to sum over; (ii) since components can be added at runtime, the original signature of the function is generic, so we need to build a new signature that reflects the actual parameters of the model when we want to use it in a fit. To make this work, when freezing parameters we build a wrapper around evaluate() with the correct signature, and pass it downstream to the static freeze() method of the parent class AbstractFitModel.
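The idea of wrapping a generic function with an explicit signature and then freezing some of its parameters can be sketched with the standard inspect module. The names make_explicit_wrapper, freeze_sketch and generic_evaluate are hypothetical, and the actual freeze() implementation may differ in the details.

```python
import inspect


def make_explicit_wrapper(function, parameter_names):
    """Wrap function(x, *parameter_values) so that the wrapper advertises
    the explicit signature (x, name1, name2, ...).

    Note the wrapper must still be called with positional arguments.
    """
    def wrapper(x, *parameter_values):
        return function(x, *parameter_values)

    kind = inspect.Parameter.POSITIONAL_OR_KEYWORD
    parameters = [inspect.Parameter("x", kind)]
    parameters += [inspect.Parameter(name, kind) for name in parameter_names]
    # inspect.signature() honors the __signature__ attribute.
    wrapper.__signature__ = inspect.Signature(parameters)
    return wrapper


def freeze_sketch(function, **constraints):
    """Return a function of the free parameters only, with the others fixed."""
    names = list(inspect.signature(function).parameters)[1:]
    free_names = [name for name in names if name not in constraints]

    def frozen(x, *free_values):
        values = dict(zip(free_names, free_values))
        values.update(constraints)
        return function(x, *[values[name] for name in names])

    return make_explicit_wrapper(frozen, free_names)


def generic_evaluate(x, *parameter_values):
    """Generic signature, as for a model whose components are set at runtime."""
    slope, intercept, offset = parameter_values
    return slope * x + intercept + offset


model = make_explicit_wrapper(generic_evaluate, ["slope", "intercept", "offset"])
frozen_model = freeze_sketch(model, intercept=1.0)
print(inspect.signature(frozen_model))
print(frozen_model(2.0, 3.0, 0.5))
```

Fitting routines that infer the number of parameters by inspecting the function signature need the explicit names, which is why the generic `*parameter_values` form is not sufficient on its own.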
- evaluate(x: float | ndarray, *parameter_values) float | ndarray[source]#
Overloaded method for evaluating the model.
Note this is not a static method, as we need to access the list of components to sum over.
- integral(x1: float, x2: float) float[source]#
Calculate the integral of the model between x1 and x2.
This is implemented as the sum of the integrals of the components.
Arguments#
- x1float
The minimum value of the independent variable to integrate over.
- x2float
The maximum value of the independent variable to integrate over.
Returns#
- integralfloat
The integral of the model between x1 and x2.
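Since integration is linear, the integral of a sum of models is the sum of the integrals of its components. The toy classes below (LineSketch, ConstantSketch and SumSketch, all hypothetical and unrelated to the aptapy class hierarchy) illustrate the delegation.

```python
class ConstantSketch:
    """Toy constant model (hypothetical)."""

    def __init__(self, value):
        self.value = value

    def evaluate(self, x):
        return self.value

    def integral(self, x1, x2):
        return self.value * (x2 - x1)


class LineSketch:
    """Toy linear model (hypothetical)."""

    def __init__(self, slope, intercept):
        self.slope = slope
        self.intercept = intercept

    def evaluate(self, x):
        return self.slope * x + self.intercept

    def integral(self, x1, x2):
        return 0.5 * self.slope * (x2**2 - x1**2) + self.intercept * (x2 - x1)


class SumSketch:
    """Toy composite model: the sum of its components (hypothetical)."""

    def __init__(self, *components):
        self.components = list(components)

    def evaluate(self, x):
        # Needs access to the component list, hence an instance method.
        return sum(component.evaluate(x) for component in self.components)

    def integral(self, x1, x2):
        # Sum of the integrals of the components.
        return sum(component.integral(x1, x2) for component in self.components)


model = SumSketch(LineSketch(2.0, 1.0), ConstantSketch(3.0))
print(model.evaluate(1.0), model.integral(0.0, 2.0))
```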
- plot(axes: Axes = None, fit_output: bool = False, plot_components: bool = True, **kwargs) Axes[source]#
Overloaded method for plotting the model.
Arguments#
- axesmatplotlib.axes.Axes, optional
The axes on which to plot the model. If None, uses the current axes.
- fit_outputbool, optional
If True, displays the fit output on the legend. Default is False.
- plot_componentsbool, optional
If True, plots the individual components of the model as dashed lines. Default is True.
- kwargs
Additional keyword arguments passed to the parent class.
Returns#
None
- _format_fit_output(spec: str) str[source]#
String formatting for fit output.
Arguments#
- specstr
The format specification.
Returns#
- textstr
The formatted string.