`modeling` — Modeling#

The modeling module provides all the core tools for fitting models to data, including parameter estimation and uncertainty quantification. All predefined simple models are defined in the models module, which builds on top of the functionality provided here.

More complex models can be built by summing simple ones, e.g.,

>>> from aptapy.models import Line, Gaussian
>>>
>>> model = Line() + Gaussian()

The main fitting engine supports bounded fits and/or fits with fixed parameters.

Parameters#

The first central concept in the modeling module is that of a fit parameter, represented by the FitParameter class. A fit parameter is a named, mutable object that holds a value, an optional uncertainty, and optional bounds, along with a flag that indicates whether it should be varied in a fit.

FitParameter objects provide all the facilities for pretty-printing their value and uncertainty. The following example shows the basic semantics of the class:

>>> from aptapy.modeling import FitParameter
>>>
>>> param = FitParameter(1.0, "amplitude", error=0.1)
>>> print(param)
Amplitude: 1.0 ± 0.1

Fit status#

FitStatus is a small bookkeeping class that holds all the information about the status of a fit, such as the chisquare, the number of degrees of freedom and the fit range.

Warning

At this point the implementation of the class is fairly minimal, and it is very likely that we will be adding stuff along the way.

Simple models#

Chances are you will not have to interact with FitParameter and FitStatus objects a lot, but they are central to defining and using simple fit models, and heavily used internally.

The easiest way to see how you would go about defining an actual fit model is to look at the source code for a simple one.

class Line(AbstractFitModel):

    """Linear model.
    """

    slope = FitParameter(1.)
    intercept = FitParameter(0.)

    @staticmethod
    def evaluate(x: ArrayLike, slope: float, intercept: float) -> ArrayLike:
        # pylint: disable=arguments-differ
        return slope * x + intercept

    def init_parameters(self, xdata: ArrayLike, ydata: ArrayLike, sigma: ArrayLike = 1.) -> None:
        """Overloaded method.

        This is simply using a weighted linear regression.

        .. note::

           This should provide the exact result in most cases, but, in the spirit of
           providing a common interface across all models, we are not overloading the
           fit() method. (Everything will continue working as expected, e.g., when
           one uses bounds on parameters.)
        """
        # pylint: disable=invalid-name
        if isinstance(sigma, Number):
            sigma = np.full(ydata.shape, sigma)
        weights = 1. / sigma**2.
        S0x = weights.sum()
        S1x = (weights * xdata).sum()
        S2x = (weights * xdata**2.).sum()
        S0xy = (weights * ydata).sum()
        S1xy = (weights * xdata * ydata).sum()
        D = S0x * S2x - S1x**2.
        if D != 0.:
            self.slope.init((S0x * S1xy - S1x * S0xy) / D)
            self.intercept.init((S2x * S0xy - S1x * S1xy) / D)

    def primitive(self, x: ArrayLike) -> ArrayLike:
        slope, intercept = self.parameter_values()
        return 0.5 * slope * x**2 + intercept * x

All we really have to do is to subclass AbstractFitModel, listing all the fit parameters as class attributes (assigning them sensible default values), and implement the evaluate() method, which takes as first argument the independent variable and then the values of all the fit parameters.

Note

It goes without saying that the order of the fit parameters in the argument list of the evaluate() method must match the order in which they are defined as class attributes.

In this particular case we are saying that the Line model has two fit parameters, slope and intercept, and, well, the model itself evaluates as a straight line as we would expect.
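As a sanity check of the closed-form initialization used in init_parameters() above, the following sketch re-implements the same weighted sums as a free function and applies it to noiseless data, where the slope and intercept should be recovered up to rounding. The name weighted_line_fit is made up for this example and is not part of aptapy.

```python
import numpy as np


def weighted_line_fit(xdata, ydata, sigma=1.0):
    """Closed-form weighted least-squares fit of y = slope * x + intercept.

    Hypothetical helper mirroring the sums used in Line.init_parameters().
    """
    xdata = np.asarray(xdata, dtype=float)
    ydata = np.asarray(ydata, dtype=float)
    # Broadcast a scalar sigma to the shape of the data.
    sigma = np.broadcast_to(np.asarray(sigma, dtype=float), ydata.shape)
    weights = 1.0 / sigma**2
    s0 = weights.sum()
    s1 = (weights * xdata).sum()
    s2 = (weights * xdata**2).sum()
    s0y = (weights * ydata).sum()
    s1y = (weights * xdata * ydata).sum()
    det = s0 * s2 - s1**2
    if det == 0.0:
        raise ValueError("Degenerate input: all x values coincide.")
    slope = (s0 * s1y - s1 * s0y) / det
    intercept = (s2 * s0y - s1 * s1y) / det
    return slope, intercept


# Noiseless straight line with slope 2 and intercept 1.
x = np.linspace(0.0, 10.0, 11)
slope, intercept = weighted_line_fit(x, 2.0 * x + 1.0, sigma=0.5)
```

Unlike the method above, which silently leaves the parameters untouched when the determinant vanishes, this sketch raises an exception; that is a choice made for the example only.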

When we create an instance of a fitting model

>>> model = Line()

a few things happen under the hood:

the class instance gets its own copy of each fit parameter, so that we can change their values and settings without affecting the class definition, nor other class instances;
the class instance registers the fit parameters as attributes of the instance, so that we can access them as, e.g., model.intercept, model.slope.

That's pretty much it. The next thing that you probably want to do is to fit the model to a series of data points, which you do with the fit() method, in pretty much the same fashion as you would with scipy.optimize.curve_fit. This will return a FitStatus object containing information about the fit.

Fitting primer#

Assuming that you have a set of data points xdata, ydata, the latter with associated uncertainties yerrors, the simplest fit goes like

>>> from aptapy.models import Line
>>>
>>> model = Line()
>>> status = model.fit(xdata, ydata, sigma=yerrors)

You can fit within a subrange of the input data by specifying the xmin and/or the xmax keyword arguments:

>>> from aptapy.models import Line
>>>
>>> model = Line()
>>> status = model.fit(xdata, ydata, sigma=yerrors, xmin=0., xmax=10.)

You can set bounds on the fit parameters, e.g., force the slope to be positive by doing

>>> from aptapy.models import Line
>>>
>>> model = Line()
>>> model.slope.minimum = 0.
>>> status = model.fit(xdata, ydata, sigma=yerrors)

and you can freeze any of the parameters to a fixed value during the fit

>>> from aptapy.models import Line
>>>
>>> model = Line()
>>> model.intercept.freeze(0.)
>>> status = model.fit(xdata, ydata, sigma=yerrors)

Or, really, any linear combination of the above. The fit status is able to pretty-print itself, and the fitted model can be plotted by just doing

>>> import matplotlib.pyplot as plt
>>>
>>> model.plot()
>>> plt.legend()

(The legend bit is put there on purpose, as by default the fitting model will add a nice entry in the legend with all the relevant information.)
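Since the fit() method is modeled on scipy.optimize.curve_fit, it can help to see what a bounded fit and a fit with a frozen parameter look like in plain scipy. The sketch below is not aptapy code: the line() function, the synthetic data, and the wrapper used to fix the intercept are all made up for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit


def line(x, slope, intercept):
    return slope * x + intercept


# Synthetic data: a straight line with Gaussian noise (made up for the example).
rng = np.random.default_rng(1)
xdata = np.linspace(0.0, 10.0, 50)
yerrors = np.full(xdata.shape, 0.2)
ydata = line(xdata, 2.0, 1.0) + rng.normal(0.0, 0.2, size=xdata.shape)

# Bounded fit: the slope is constrained to be non-negative.
bounds = ([0.0, -np.inf], [np.inf, np.inf])
popt, pcov = curve_fit(line, xdata, ydata, p0=(1.0, 0.0), sigma=yerrors,
                       absolute_sigma=True, bounds=bounds)


# Frozen parameter: fix the intercept to 1 and fit the slope only.
def line_fixed_intercept(x, slope):
    return line(x, slope, 1.0)


popt_frozen, pcov_frozen = curve_fit(line_fixed_intercept, xdata, ydata, p0=(1.0,),
                                     sigma=yerrors, absolute_sigma=True)
```

In both cases the parameter uncertainties can be read off as the square roots of the diagonal of the covariance matrix.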

Fitting models interact nicely with one-dimensional histograms from the hist module, so you can do

>>> import numpy as np
>>> from aptapy.hist import Histogram1d
>>> from aptapy.models import Line
>>>
>>> hist = Histogram1d(np.linspace(0., 1., 100))
>>> hist.fill(np.random.rand(1000))
>>> model = Line()
>>> status = model.fit(hist)

Location-scale models#

A number of models included in this module belong to the location-scale family, i.e., they can be expressed in terms of a location parameter and a non-negative scale parameter and, ultimately, are characterized by a universal shape function \(g(z)\) of the standardized variable

\[z = \frac{x - m}{s},\]

where \(m\) is the location parameter and \(s\) is the scale parameter. The Gaussian probability density function is the prototypical example of a location-scale model (with the mean as location and the standard deviation as scale), but many other models belong to this family, both peak-like and sigmoid-like.

From the point of view of the practical implementation, most of the location-scale models in aptapy.models wrap scipy.stats.rv_continuous distributions, which already provide most of the necessary functionality.
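To make the role of the standardized variable concrete, here is a minimal sketch for the Gaussian case, using only the Python standard library. It assumes the usual convention for a normalized density, \(f(x) = g(z) / s\), and checks the result against statistics.NormalDist; the function names are illustrative and not part of aptapy.

```python
import math
from statistics import NormalDist


def shape(z):
    """Universal shape function g(z) of the Gaussian: the standard normal pdf."""
    return math.exp(-0.5 * z**2) / math.sqrt(2.0 * math.pi)


def location_scale_pdf(x, location, scale):
    """Evaluate g((x - m) / s) / s; the 1 / s factor preserves the normalization."""
    z = (x - location) / scale
    return shape(z) / scale


value = location_scale_pdf(1.5, location=1.0, scale=2.0)
reference = NormalDist(mu=1.0, sigma=2.0).pdf(1.5)
```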

Sigmoid models#

Location-scale sigmoid-like models all inherit from the abstract base class AbstractSigmoidFitModel, and, just like in the previous case, they must provide a concrete implementation of the shape function. The latter is expected to be a monotonically increasing function, ranging from 0 to 1 as its argument goes from -infinity to +infinity, and the meaning of the amplitude parameter in this case is that of the total change in the function value across the transition region, so that the general model reads

\[f(x; A, m, s, ...) = A g\left(\frac{x - m}{s}; ...\right)\]

(note that there is no division by the scale parameter in this case). Accordingly, the implementation of the evaluation method reads:

    def evaluate(self, x: ArrayLike, amplitude: float, location: float,
                 scale: float, *parameter_values: float) -> ArrayLike:
        """Overloaded method for evaluating the model.

        Note if the scale is negative, we take the complement of the sigmoid function.
        """
        # pylint: disable=arguments-differ
        z = (x - location) / abs(scale)
        val = amplitude * self.shape(z, *parameter_values)
        return val if scale >= 0. else amplitude - val
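The effect of a negative scale is easiest to see with a concrete shape function. The sketch below reproduces the evaluation logic above as a free function, using the logistic function as an assumed example of a shape going monotonically from 0 to 1; the names are made up for the illustration.

```python
import math


def logistic_shape(z):
    """Monotonically increasing shape function going from 0 to 1."""
    return 1.0 / (1.0 + math.exp(-z))


def evaluate_sigmoid(x, amplitude, location, scale):
    """Standalone version of the evaluation logic shown above."""
    z = (x - location) / abs(scale)
    val = amplitude * logistic_shape(z)
    # A negative scale flips the sigmoid into its complement.
    return val if scale >= 0.0 else amplitude - val


rising = evaluate_sigmoid(3.0, amplitude=2.0, location=1.0, scale=0.5)
falling = evaluate_sigmoid(3.0, amplitude=2.0, location=1.0, scale=-0.5)
midpoint = evaluate_sigmoid(1.0, amplitude=2.0, location=1.0, scale=0.5)
```

At any given x the rising and falling versions add up to the amplitude, and for this symmetric shape the value at the location is half the amplitude.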

Gaussian forests#

The GaussianForestBase class provides a base for implementing models made of a forest of Gaussian peaks with fixed energies. An example of such a model is provided by the Fe55Forest class, which implements a forest of Gaussian peaks for the Kα and Kβ emission features of \(^{55}\mathrm{Fe}\) decay.

Model of a forest of Gaussian peaks with fixed energies \(E_i\). Each peak has amplitude \(A_i\), while all peaks share a global energy scale \(E_s\) and a common width \(\sigma\), scaled as \(1/\sqrt{E_i / E_0}\).

\[g(z) = \sum_i A_i \exp \left[- \frac{(z - E_i / E_s)^2}{(\sigma / \sqrt{E_i / E_0})^2} \right]\]

Note

Base class used by models generated with aptapy.modeling.line_forest().
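The formula above can be transcribed almost literally into numpy. The following sketch is not the aptapy implementation: it follows the expression as written (including the normalization of the exponent), assumes that the reference energy \(E_0\) is the first line energy, and uses approximate values for the Mn Kα and Kβ energies in keV.

```python
import numpy as np

# Approximate Mn K-alpha and K-beta energies (keV) following 55Fe decay.
ENERGIES = (5.90, 6.49)


def gaussian_forest(z, amplitudes, energy_scale, sigma, energies=ENERGIES):
    """Evaluate the forest formula as written above.

    The reference energy E_0 is assumed to be the first line energy.
    """
    z = np.asarray(z, dtype=float)
    e0 = energies[0]
    total = np.zeros_like(z)
    for amplitude, energy in zip(amplitudes, energies):
        # Per-line width, scaled with the line energy.
        width = sigma / np.sqrt(energy / e0)
        total += amplitude * np.exp(-((z - energy / energy_scale) ** 2) / width**2)
    return total


# With a single line, the value at the peak position is the amplitude itself.
peak = gaussian_forest(5.90, amplitudes=(10.0,), energy_scale=1.0, sigma=0.1,
                       energies=(5.90,))
# Two-line forest evaluated on a small grid.
grid = gaussian_forest(np.linspace(5.0, 7.0, 5), (10.0, 2.0), 1.0, 0.1)
```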

Composite models#

The modeling module also provides a way to build composite models by summing simple ones. This is achieved by means of the FitModelSum class, which is designed to hold a list of components and interoperate with the rest of the world in exactly the same fashion as simple models.

Chances are you will never have to instantiate a FitModelSum object directly, as the + operator will do the trick in most cases, e.g.,

>>> from aptapy.models import Line, Gaussian
>>> model = Line() + Gaussian()
>>> status = model.fit(xdata, ydata, sigma=yerrors)
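Conceptually, a sum of models is a function whose flat parameter list is split, in order, among the components. The sketch below illustrates the idea with plain functions; sum_models and the Gaussian parametrization used here are hypothetical and do not reflect how FitModelSum is actually implemented.

```python
import math


def line(x, slope, intercept):
    return slope * x + intercept


def gaussian(x, amplitude, mean, sigma):
    z = (x - mean) / sigma
    return amplitude * math.exp(-0.5 * z**2)


def sum_models(*components):
    """Build a function that sums (function, num_parameters) components.

    The flat parameter list is split, in order, among the components.
    """
    def total(x, *parameter_values):
        value, start = 0.0, 0
        for function, num_parameters in components:
            chunk = parameter_values[start:start + num_parameters]
            value += function(x, *chunk)
            start += num_parameters
        return value
    return total


model = sum_models((line, 2), (gaussian, 3))
# Parameters, in order: slope, intercept, amplitude, mean, sigma.
y = model(5.0, 0.5, 1.0, 10.0, 5.0, 1.0)
```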

Module documentation#

Modeling core facilities.

class aptapy.modeling.Format(*values)[source]#

Small enum class to control string formatting.

This is leveraging the custom formatting of the uncertainties package, where a trailing P means “pretty print” and a trailing L means “LaTeX”.

PRETTY = 'P'#

LATEX = 'L'#

class aptapy.modeling.FitParameter(value: float, _name: str = None, error: float = None, _frozen: bool = False, minimum: float = -inf, maximum: float = inf)[source]#

Small class describing a fit parameter.

value: float#

_name: str = None#

error: float = None#

_frozen: bool = False#

minimum: float = -inf#

maximum: float = inf#

property name: str#

Return the parameter name.

We are wrapping this into a property because, arguably, the parameter name is the only thing we never, ever want to change after the fact.

Returns#

name (str): The parameter name.

property frozen: bool#

Return True if the parameter is frozen.

We are wrapping this into a property because we interact with this member via the freeze() and thaw() methods.

Returns#

frozen (bool): True if the parameter is frozen.

is_bound() → bool[source]#

Return True if the parameter is bounded.

Returns#

bounded (bool): True if the parameter is bounded.

copy(name: str) → FitParameter[source]#

Create a copy of the parameter object with a new name.

This is necessary because we define the fit parameters of the actual model as class variables holding the default value, and each instance gets its own copy of the parameter, where the name is automatically inferred.

Note that, in addition to the name being passed as an argument, we only carry over the value and bounds of the original fit parameter: the new object is created with error = None and _frozen = False.

Arguments#

name (str): The name for the new FitParameter object.

Returns#

parameter (FitParameter): The new FitParameter object.

set(value: float, error: float = None) → None[source]#

Set the parameter value and error.

Arguments#

value (float): The new value for the parameter.
error (float, optional): The new error for the parameter (default None).

init(value: float) → None[source]#

Initialize the fit parameter to a given value, unless it is frozen, or the value is out of bounds.

Warning

Note this silently does nothing if the parameter is frozen, or if the value is out of bounds, so its behavior is inconsistent with that of set(), which raises an exception in both cases. This is intentional, and this method should only be used to initialize the parameter prior to a fit.

Arguments#

value (float): The new value for the parameter.

freeze(value: float) → None[source]#

Freeze the fit parameter to a given value.

Note that the error is set to None.

Arguments#

value (float): The new value for the parameter.

thaw() → None[source]#: Un-freeze the fit parameter.

ufloat() → ufloat[source]#

Return the parameter value and error as a ufloat object.

Returns#

ufloat (uncertainties.ufloat): The parameter value and error as a ufloat object.

pull(expected: float) → float[source]#

Calculate the pull of the parameter with respect to a given expected value.

Arguments#

expected (float): The expected value for the parameter.

Returns#

pull (float): The pull of the parameter with respect to the expected value, defined as (value - expected) / error.

Raises#

RuntimeError: If the parameter has no error associated to it.

compatible_with(expected: float, num_sigma: float = 3.0) → bool[source]#

Check if the parameter is compatible with an expected value within num_sigma.

Arguments#

expected (float): The expected value for the parameter.
num_sigma (float, optional): The number of sigmas to use for the compatibility check (default 3).

Returns#

compatible (bool): True if the parameter is compatible with the expected value within num_sigma.
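The pull and the compatibility check boil down to a couple of lines of arithmetic. The sketch below uses free functions rather than FitParameter methods, and it assumes that compatibility means that the absolute value of the pull does not exceed num_sigma (whether the actual comparison is strict is not stated here).

```python
def pull(value, error, expected):
    """Return (value - expected) / error."""
    if error is None:
        raise RuntimeError("The parameter has no error associated to it.")
    return (value - expected) / error


def compatible_with(value, error, expected, num_sigma=3.0):
    """Assumption: compatibility means |pull| <= num_sigma."""
    return abs(pull(value, error, expected)) <= num_sigma


# A measurement of 1.2 +/- 0.1 against an expected value of 1.0.
p = pull(1.2, 0.1, 1.0)
```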

class aptapy.modeling.FitStatus(popt: ndarray = None, pcov: ndarray = None, chisquare: float = None, dof: int = None, pvalue: float = None, correlated_pars: ndarray = None)[source]#

Small dataclass to hold the fit status.

popt: ndarray = None#

pcov: ndarray = None#

chisquare: float = None#

dof: int = None#

pvalue: float = None#

correlated_pars: ndarray = None#

reset() → None[source]#: Reset the fit status.

valid() → bool[source]#

Return True if the fit status is valid, i.e., all the fields are set.

Returns#

valid (bool): True if the fit status is valid.

update(popt: ndarray, pcov: ndarray, chisquare: float, dof: int) → None[source]#

Update the fit status, i.e., set the chisquare and calculate the corresponding p-value.

Arguments#

popt (array_like): The optimal values for the fit parameters.
pcov (array_like): The covariance matrix for the fit parameters.
chisquare (float): The chisquare of the fit.
dof (int): The number of degrees of freedom of the fit.
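The update() method is documented as calculating the p-value corresponding to the chisquare. A common way of doing this, assumed here and not taken from the aptapy source, is to evaluate the survival function of the chi-square distribution with scipy.stats.chi2.

```python
from scipy.stats import chi2


def chisquare_pvalue(chisquare, dof):
    """Probability of a chisquare at least this large, for dof degrees of freedom."""
    return chi2.sf(chisquare, dof)


pvalue = chisquare_pvalue(12.5, 10)
```

For two degrees of freedom the survival function reduces to exp(-chisquare / 2), which gives a quick way to check the helper by hand.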

class aptapy.modeling.AbstractFitModelBase(label: str = None, xlabel: str = None, ylabel: str = None)[source]#

Abstract base class for all the fit classes.

This acts as a base class for both simple fit models and for composite models (e.g., sums of simple ones).

Arguments#

label (str, optional): The label for the model. If this is None, the model name is used as default, which makes sense because the name is how we would label a fit model in most circumstances.
xlabel (str, optional): The label for the x-axis.
ylabel (str, optional): The label for the y-axis.

abstractmethod static evaluate(x: float | ndarray, *parameter_values: float) → float | ndarray[source]#

Evaluate the model at a given set of parameter values.

Arguments#

x (array_like): The value(s) of the independent variable.
parameter_values (sequence of float): The value of the model parameters.

Returns#

y (array_like): The value(s) of the model at the given value(s) of the independent variable for a given set of parameter values.

_wrap_evaluate() → Callable[source]#

Helper function to build a wrapper around the evaluate() method with the (correct) explicit signature, including all the parameter names.

This is used, e.g., by FitModelSum and GaussianForestBase to wrap the evaluate() method, which is expressed in terms of a generic signature, before the method itself is passed to the freeze() method.

jacobian(x: float | ndarray, *parameter_values: float, eps: float = 1e-08) → ndarray[source]#

Numerically calculate the Jacobian matrix of partial derivatives of the model with respect to the parameters.

This is used, e.g., to plot confidence bands around the best-fit model.

Arguments#

x (array_like): The value(s) of the independent variable.
parameter_values (sequence of float): The value of the model parameters. If no parameters are passed, the current values are used by default. Alternatively, all the model parameters must be passed, otherwise a ValueError is raised.
eps (float, optional): The step size to use for the numerical differentiation.

Returns#

J (ndarray): The Jacobian matrix of partial derivatives. The shape of the array is (m, n), where m is the number of data points where the Jacobian is calculated, and n the number of parameters.
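To illustrate the shape convention, here is a minimal sketch of a numerical Jacobian based on forward finite differences. The differencing scheme is an assumption made for the example (the scheme used by the jacobian() method is not specified here), and numerical_jacobian is a hypothetical free function.

```python
import numpy as np


def numerical_jacobian(model, x, parameter_values, eps=1e-8):
    """Forward-difference Jacobian of model(x, *parameter_values).

    Returns an array of shape (m, n): m evaluation points, n parameters.
    """
    x = np.atleast_1d(np.asarray(x, dtype=float))
    parameter_values = np.asarray(parameter_values, dtype=float)
    reference = model(x, *parameter_values)
    jacobian = np.empty((x.size, parameter_values.size))
    for i in range(parameter_values.size):
        # Shift one parameter at a time by eps.
        shifted = parameter_values.copy()
        shifted[i] += eps
        jacobian[:, i] = (model(x, *shifted) - reference) / eps
    return jacobian


def line(x, slope, intercept):
    return slope * x + intercept


x = np.linspace(0.0, 1.0, 5)
J = numerical_jacobian(line, x, (2.0, 1.0))
```

For a straight line the first column approximates x (the derivative with respect to the slope) and the second column approximates 1 (the derivative with respect to the intercept).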

name() → str[source]#

Return the model name, e.g., for legends.

Note this can be reimplemented in concrete subclasses, but it should provide a sensible default value in most circumstances.

Returns#

name (str): The model name.

init_parameters(xdata: float | ndarray, ydata: float | ndarray, sigma: float | ndarray) → None[source]#

Optional hook to change the current parameter values of the model, prior to a fit, based on the input data.

Arguments#

xdata (array_like): The input values of the independent variable.
ydata (array_like): The input values of the dependent variable.
sigma (array_like): The input uncertainties on the dependent variable.

parameter_values() → Tuple[float][source]#

Return the current parameter values.

Note this only relies on the __iter__() method, so it works both for simple and composite models.

Returns#

values (tuple of float): The current parameter values.

free_parameters() → Tuple[FitParameter][source]#

Return the list of free parameters.

Note this only relies on the __iter__() method, so it works both for simple and composite models.

Returns#

parameters (tuple of FitParameter): The list of free parameters.

free_parameter_values() → Tuple[float][source]#

Return the current values of the free parameters.

Returns#

values (tuple of float): The current values of the free parameters.

bounds() → Tuple[float | ndarray, float | ndarray][source]#

Return the bounds on the fit parameters in a form that can be used by the fitting method.

Returns#

bounds (2-tuple of array_like): The lower and upper bounds on the (free) fit parameters.

set_parameters(*parameter_values: float) → None[source]#

Set the model parameters to the given values.

Arguments#

parameter_values (sequence of float): The new values for the model parameters.

update_parameters(popt: ndarray, pcov: ndarray) → None[source]#

Update the model parameters based on the output of the curve_fit() call.

Arguments#

popt (array_like): The optimal values for the fit parameters.
pcov (array_like): The covariance matrix for the fit parameters.

calculate_chisquare(xdata: ndarray, ydata: ndarray, sigma) → float[source]#

Calculate the chisquare of the fit to some input data with the current model parameters.

Arguments#

xdata (array_like): The input values of the independent variable.
ydata (array_like): The input values of the dependent variable.
sigma (array_like): The input uncertainties on the dependent variable.

Returns#

chisquare (float): The chisquare of the fit.

static freeze(model_function, **constraints) → Callable[source]#

Freeze a subset of the model parameters.

Arguments#

model_function (callable): The model function to freeze parameters for.
constraints (dict): The parameters to freeze, as keyword arguments.

Returns#

wrapper (callable): A wrapper around the model function with the given parameters frozen.
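The idea of freezing parameters by wrapping the model function can be sketched in a few lines. This is an illustration only and not the aptapy implementation: it assumes that the first argument of the model function is the independent variable, and the resulting wrapper takes the remaining free parameters positionally.

```python
import inspect


def freeze(model_function, **constraints):
    """Return a wrapper of model_function with some parameters held fixed.

    Sketch only: the first argument is taken to be the independent variable,
    and the wrapper accepts the remaining free parameters positionally.
    """
    names = list(inspect.signature(model_function).parameters)[1:]
    unknown = set(constraints) - set(names)
    if unknown:
        raise ValueError(f"Unknown parameter(s): {sorted(unknown)}")
    free_names = [name for name in names if name not in constraints]

    def wrapper(x, *free_values):
        if len(free_values) != len(free_names):
            raise TypeError(f"Expected {len(free_names)} free parameter value(s)")
        kwargs = dict(zip(free_names, free_values))
        return model_function(x, **kwargs, **constraints)

    return wrapper


def line(x, slope, intercept):
    return slope * x + intercept


# Freeze the intercept to zero: only the slope is left free.
line_through_origin = freeze(line, intercept=0.0)
y = line_through_origin(3.0, 2.0)
```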

fit(...) → FitStatus: Fit a series of points.

Arguments#

xdata (array_like or one-dimensional histogram): The input values of the independent variable or a 1-dimensional histogram.
ydata (array_like, optional): The input values of the dependent variable.
p0 (array_like, optional): The initial values for the fit parameters.
sigma (array_like, optional): The input uncertainties on the dependent variable.
absolute_sigma (bool, optional, default False): See the curve_fit() documentation for details.
xmin (float, optional, default -inf): The minimum value of the independent variable to fit. Note that if xmin > xmax the (xmax, xmin) interval is excluded from the fit.
xmax (float, optional, default inf): The maximum value of the independent variable to fit. Note that if xmin > xmax the (xmax, xmin) interval is excluded from the fit.
kwargs (dict, optional): Additional keyword arguments passed to curve_fit().

Returns#

status (FitStatus): The status of the fit.

static default_plotting_range() → Tuple[float, float][source]#

Return the default plotting range for the model.

This can be reimplemented in concrete models, and can be parameter-dependent (e.g., for a gaussian we might want to plot within 5 sigma from the mean by default). And if you think for a moment to move this to a DEFAULT_PLOTTING_RANGE class variable, keep in mind that having it as a method allows for parameter-dependent default ranges.

Returns#

Tuple[float, float]: The default plotting range for the model.

set_plotting_range(xmin: float, xmax: float) → None[source]#

Set a custom plotting range for the model.

Arguments#

xmin (float): The minimum x value for plotting.
xmax (float): The maximum x value for plotting.

plotting_range() → Tuple[float, float][source]#

Return the current plotting range for the model.

If a custom plotting range has been set via set_plotting_range(), or as a part of a fit, that is returned, otherwise the default plotting range for the model is used.

Returns#

Tuple[float, float]: The plotting range for the model.

_plotting_grid() → ndarray[source]#

Return the grid of x values to use for plotting the model.

Returns#

x (np.ndarray): The x values used for plotting the model.

_render(axes: Axes = None, **kwargs) → None[source]#

Render the model on the given axes.

Arguments#

axes (matplotlib.axes.Axes, optional): The axes to plot on (default: current axes).
kwargs (dict, optional): Additional keyword arguments passed to axes.plot().

plot(axes: Axes = None, fit_output: bool = False, **kwargs) → Axes[source]#

Plot the model.

Arguments#

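axes (matplotlib.axes.Axes, optional): The axes to plot on (default: current axes).
fit_output (bool, optional): Whether to include the fit output in the legend (default: False).
kwargs (dict, optional): Additional keyword arguments passed to plt.plot().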

confidence_band(x: float | ndarray, num_sigma: float = 1.0) → ndarray[source]#

Return the vertical width of the n-sigma confidence band at the given x values.

Note this assumes that the model has been fitted to data and is equipped with a valid FitStatus. A RuntimeError is raised if that is not the case.

Arguments#

x (array_like): The x values where the confidence delta is calculated.
num_sigma (float): The number of sigmas for the band (default 1).

Returns#

delta (np.ndarray): The vertical width of the n-sigma confidence band at the given x values.
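A standard way to obtain such a band, assumed here and not taken from the aptapy source, is linear error propagation: given the Jacobian \(J\) of the model with respect to the parameters and the covariance matrix \(C\) of the fit, the variance of the model at each point is the corresponding diagonal element of \(J C J^T\). The sketch below uses a hypothetical free function and an analytic Jacobian for a straight line.

```python
import numpy as np


def confidence_band(jacobian, pcov, num_sigma=1.0):
    """Return num_sigma * sqrt(diag(J C J^T)) for each evaluation point."""
    jacobian = np.asarray(jacobian, dtype=float)
    pcov = np.asarray(pcov, dtype=float)
    # Diagonal of J C J^T, computed without building the full (m, m) matrix.
    variance = np.einsum("ij,jk,ik->i", jacobian, pcov, jacobian)
    return num_sigma * np.sqrt(variance)


# Straight line: the Jacobian columns are d/d(slope) = x and d/d(intercept) = 1.
x = np.array([0.0, 1.0, 2.0])
J = np.column_stack([x, np.ones_like(x)])
pcov = np.array([[0.04, 0.0], [0.0, 0.01]])
delta = confidence_band(J, pcov, num_sigma=2.0)
```

With a diagonal covariance matrix the result reduces to the familiar sum in quadrature of the slope and intercept contributions.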

plot_confidence_band(axes: Axes = None, num_sigma: float = 1.0, **kwargs) → Axes[source]#

Plot the n-sigma confidence band around the best-fit model.

Arguments#

axes (matplotlib.axes.Axes, optional): The axes to plot on (default: current axes).
num_sigma (float, optional): The number of sigmas for the confidence band (default: 1).
kwargs (dict, optional): Additional keyword arguments passed to axes.fill_between().

Returns#

matplotlib.axes.Axes: The axes with the confidence band plotted.

random_fit_dataset(sigma: float | ndarray, num_points: int = 25, seed: int = None) → Tuple[ndarray, ndarray][source]#

Generate a random sample from the model, adding gaussian noise.

Arguments#

sigma (array_like): The standard deviation of the gaussian noise to add to the model.
num_points (int, optional): The number of points to generate (default 25).
seed (int, optional): The random seed to use (default None).

Returns#

xdata (np.ndarray): The x values of the random sample.
ydata (np.ndarray): The y values of the random sample.

rvs(size: int = 1, random_state=None)[source]#

Generate random variates from the underlying distribution at the current parameter values.

Arguments#

size (int, optional): The number of random variates to generate (default 1).
random_state (int or np.random.Generator, optional): The random seed or generator to use (default None).

random_histogram(edges: ndarray, size: int, random_state=None) → Histogram1d[source]#

Generate a histogram filled with random variates from the underlying distribution at the current parameter values.

Arguments#

edges (np.ndarray): The bin edges of the histogram.
size (int, optional): The number of random variates to generate (default 100000).
random_state (int or np.random.Generator, optional): The random seed or generator to use (default None).

Returns#

Histogram1d: A histogram filled with random variates from the distribution.

_format_fit_output(spec: str) → str[source]#

String formatting for fit output.

Arguments#

spec (str): The format specification.

Returns#

text (str): The formatted string.


class aptapy.modeling.AbstractFitModel(label: str = None, xlabel: str = None, ylabel: str = None)[source]#

Abstract base class for a fit model.

classmethod _parameter_dict() → Dict[str, FitParameter][source]#

Return a dictionary of all the FitParameter objects defined in the class and its base classes.

This is a subtle one, as what we really want, here, is all members of a class (including inherited ones) that are of a specific type (FitParameter), in the order they were defined. All of these things are instrumental to making the fit model work, so we need to be careful.

Also note that we are looping over the MRO in reverse order, so that we preserve the order of definition of the parameters, even when they are inherited from base classes. If a parameter is re-defined in a derived class, the derived class definition takes precedence, as we are using a dictionary to collect the parameters.

Arguments#

cls (type): The class to inspect.

Returns#

param_dict (dict): A dictionary mapping parameter names to their FitParameter objects.
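The MRO-based collection described above can be sketched with plain Python. The Parameter class below is a hypothetical stand-in for FitParameter, and parameter_dict is a free function written for the example rather than the actual classmethod.

```python
class Parameter:
    """Hypothetical stand-in for FitParameter."""

    def __init__(self, value):
        self.value = value


def parameter_dict(cls):
    """Collect Parameter class attributes, base classes first."""
    params = {}
    for base in reversed(cls.__mro__):
        for name, obj in vars(base).items():
            if isinstance(obj, Parameter):
                # Re-assigning an existing key keeps its original position
                # in the dictionary but replaces the value.
                params[name] = obj
    return params


class Base:
    amplitude = Parameter(1.0)
    location = Parameter(0.0)


class Derived(Base):
    index = Parameter(2.0)
    amplitude = Parameter(5.0)


params = parameter_dict(Derived)
```

Here the redefined amplitude keeps its position from the base class while taking the value from the derived class, and the new index parameter is appended at the end.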

quadrature(x1: float, x2: float) → float[source]#

Calculate the integral of the model between x1 and x2 using numerical integration.

Arguments#

x1 (float): The minimum value of the independent variable to integrate over.
x2 (float): The maximum value of the independent variable to integrate over.

Returns#

integral (float): The integral of the model between x1 and x2.

integral(x1: float, x2: float) → float[source]#

Default implementation of the integral of the model between x1 and x2. Subclasses can (and are encouraged to) overload this method with an analytical implementation, when available.

Arguments#

x1 (float): The minimum value of the independent variable to integrate over.
x2 (float): The maximum value of the independent variable to integrate over.

Returns#

integral (float): The integral of the model between x1 and x2.


class aptapy.modeling.AbstractSigmoidFitModel(label: str = None, xlabel: str = None, ylabel: str = None)[source]#

Abstract base class for fit models representing sigmoids.

amplitude = FitParameter(value=1.0, _name=None, error=None, _frozen=False, minimum=-inf, maximum=inf)#

location = FitParameter(value=0.0, _name=None, error=None, _frozen=False, minimum=-inf, maximum=inf)#

scale = FitParameter(value=1.0, _name=None, error=None, _frozen=False, minimum=-inf, maximum=inf)#

abstractmethod static shape(z: float | ndarray, *parameter_values: float) → float | ndarray[source]#

Abstract method for the normalized shape of the sigmoid model. Subclasses must implement this method.

Arguments#

z (array_like): The normalized independent variable.
parameter_values (float): Additional shape parameters for the sigmoid.

Returns#

array_like: The value of the sigmoid shape function at z.

evaluate(x: float | ndarray, amplitude: float, location: float, scale: float, *parameter_values: float) → float | ndarray[source]#

Overloaded method for evaluating the model.

Note if the scale is negative, we take the complement of the sigmoid function.

init_parameters(xdata: float | ndarray, ydata: float | ndarray, sigma: float | ndarray = 1.0)[source]#: Overloaded method.

default_plotting_range() → Tuple[float, float][source]#

Overloaded method.

By default the plotting range is set to be an interval centered on the location parameter, and extending for a number of scale units on each side.


class aptapy.modeling.AbstractCRVFitModel(label: str = None, xlabel: str = None, ylabel: str = None)[source]#

Abstract base class for fit models based on continuous random variables.

(Typically we will use this, in conjunction with the wrap_rv_continuous decorator, to wrap continuous random variables from scipy.stats).

The general rule for the signature of scipy distributions is that they accept all the shape parameters first, and then loc and scale. The wrap_rv_continuous decorator creates a fit model class with the appropriate methods, reading dist.shapes (and numargs) to determine the positional shape arguments, and assuming that the loc and scale keywords are always supported.

amplitude = FitParameter(value=1.0, _name=None, error=None, _frozen=False, minimum=-inf, maximum=inf)#

location = FitParameter(value=0.0, _name=None, error=None, _frozen=False, minimum=-inf, maximum=inf)#

scale = FitParameter(value=1.0, _name=None, error=None, _frozen=False, minimum=0, maximum=inf)#

_rv = None#

classmethod evaluate(x, amplitude, location, scale, *args)[source]#

Overloaded method for evaluating the model.

This takes the pdf of the underlying distribution and scales it by the amplitude.

classmethod primitive(x, amplitude, location, scale, *args)[source]#

Overloaded method for evaluating the primitive of the model.

Note this is not just a primitive, it is the actual cumulative distribution function (cdf) scaled by the amplitude. We keep the primitive() name because in general not all the fit models are normalizable, and still we want to keep a common interface.

support()[source]#: Return the support of the underlying distribution at the current parameter values.

ppf(p: float | ndarray)[source]#

Return the percent point function (inverse of cdf) of the underlying distribution for a given quantile at the current parameter values.

Arguments#

p (array_like): The quantile(s) to evaluate the ppf at.

median()[source]#: Return the median of the underlying distribution at the current parameter values.

mean()[source]#: Return the mean of the underlying distribution at the current parameter values.

std()[source]#: Return the standard deviation of the underlying distribution at the current parameter values.

rvs(size: int = 1, random_state=None)[source]#

Generate random variates from the underlying distribution at the current parameter values.

Arguments#

size (int, optional): The number of random variates to generate (default 1).
random_state (int or np.random.Generator, optional): The random seed or generator to use (default None).

init_parameters(xdata: float | ndarray, ydata: float | ndarray, sigma: float | ndarray = 1.0) → None[source]#

Overloaded method.

This is tailored to unimodal distributions, where we start from the basic statistics (average, standard deviation and area) of the input sample and try to match the amplitude, location and scale of the distribution to be fitted. No attempt is made at setting the shape parameters (if any).

default_plotting_range() → Tuple[float, float][source]#

Overloaded method.

Note we have access to all the goodies of a scipy.stats.rv_continuous object here (e.g., the support of the function, and the mean and standard deviation when they are finite), so we can be fairly clever in setting up a generic method that works out of the box in many cases.

plot(axes: Axes = None, fit_output: bool = False, plot_mean: bool = True, **kwargs) → None[source]#

Plot the model.

Note this is reimplemented from scratch to allow overplotting the mean of the distribution.

Arguments#

axes (matplotlib.axes.Axes, optional): The axes to plot on (default: current axes).
fit_output (bool, optional): Whether to include the fit output in the legend (default: False).
plot_mean (bool, optional): Whether to overplot the mean of the distribution (default: True).
kwargs (dict, optional): Additional keyword arguments passed to plt.plot().

class aptapy.modeling.PhonyCRVFitModel(scipy_version: str)[source]#: Placeholder class that keeps the module from breaking when a particular scipy.stats distribution is not available in a given scipy version.

aptapy.modeling.wrap_rv_continuous(rv, **shape_parameters) → type[source]#

Decorator to wrap a scipy.stats.rv_continuous object into a fit model.

This is fairly minimal, and basically amounts to adding all the necessary shape parameters to the underlying fit model class. Note that the names of the parameters are inferred from the rv.shapes attribute, and each shape parameter is set to 1.0 by default (with a minimum of 0.0) unless this is overridden via the shape_parameters argument.

Arguments#

rv (scipy.stats.rv_continuous): The scipy.stats.rv_continuous object to wrap.
shape_parameters (dict, optional): Additional shape parameters to be set up with non-default FitParameter objects (e.g., to set different minimum/maximum values).
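The bookkeeping described above can be sketched without scipy, given that the `shapes` attribute of a scipy.stats distribution is either None or a comma-separated string of names such as `"a, b"`. The `Param` class and the `shape_parameters_for` helper below are hypothetical stand-ins for FitParameter and for the internals of the decorator.

```python
import math
from dataclasses import dataclass


@dataclass
class Param:
    """Hypothetical, stripped-down stand-in for FitParameter."""

    value: float = 1.0
    minimum: float = 0.0
    maximum: float = math.inf


def shape_parameters_for(shapes, **overrides):
    """Build one parameter per shape name; `shapes` mimics rv.shapes."""
    names = [] if shapes is None else [name.strip() for name in shapes.split(",")]
    unknown = set(overrides) - set(names)
    if unknown:
        raise ValueError(f"unknown shape parameter(s): {sorted(unknown)}")
    # Default to value 1.0 and minimum 0.0 unless an override is supplied.
    return {name: overrides.get(name, Param()) for name in names}


# Two shape parameters, the second one with non-default settings.
params = shape_parameters_for("a, b", b=Param(2.0, minimum=0.5))
# A distribution with no shape parameters at all.
no_params = shape_parameters_for(None)
```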

aptapy.modeling.line_forest(*energies: float) → Callable[[type], type][source]#

Decorator to build a line forest fit model.

A line forest is a collection of spectral lines at known energies, each with an independent amplitude, all sharing a common energy scale and with a line width (sigma) that scales as the square root of the line energy.

This decorator simply adds a class attribute storing the line energies and creates all the necessary FitParameter objects.

While the decorator is agnostic about the actual line shape, the GaussianForestBase class is a good example of how to use this decorator to build a line forest fit model.

Arguments#

energies (float): The energies of the lines in the forest. These are typically provided in physical units (e.g., keV), whereas the energy scale parameter determines the conversion between the energy and whatever units the fit model is actually evaluated in (e.g., ADC counts).
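The general pattern of a parametrized class decorator that stores the energies and derives the parameter list can be sketched as follows. Everything here is illustrative: `toy_line_forest`, the `parameter_names` attribute and the `amplitude0`, `amplitude1`, ... naming scheme are assumptions, and the real decorator creates FitParameter objects rather than plain names.

```python
def toy_line_forest(*energies):
    """Class decorator storing the line energies and the derived parameter names."""

    def decorator(cls):
        cls.energies = tuple(energies)
        # One amplitude per line, plus the shared energy scale and width.
        names = [f"amplitude{i}" for i in range(len(energies))]
        cls.parameter_names = tuple(names + ["energy_scale", "sigma"])
        return cls

    return decorator


@toy_line_forest(5.90, 6.49)
class ToyForest:
    """Placeholder for a concrete forest model (no line shape attached)."""
```

Because the decorator only touches class attributes, the same mechanism works for any line shape defined by the decorated class.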

class aptapy.modeling.GaussianForestBase(label: str = None, xlabel: str = None, ylabel: str = None)[source]#

Abstract base model representing a forest of Gaussian spectral lines at fixed energies.

Concrete models need to be decorated with the @line_forest decorator, specifying the energies of the lines included in the forest.

Each peak corresponds to a known energy, and the model allows for fitting the amplitudes, a global energy scale, and a common width (sigma) that scales as the square root of the line energy, as is commonly observed in particle detectors.
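A minimal sketch of such a model is given below, under two explicit assumptions that are not taken from the source: the peak position is the line energy multiplied by the energy scale, and the common sigma is quoted at the energy of the first line. The function name is hypothetical.

```python
import math


def toy_gaussian_forest(x, energies, amplitudes, energy_scale, sigma):
    """Sum of area-normalized Gaussians sharing one energy scale and one width."""
    reference = energies[0]
    total = 0.0
    for energy, amplitude in zip(energies, amplitudes):
        position = energy * energy_scale               # assumed: x = energy * scale
        width = sigma * math.sqrt(energy / reference)  # assumed: sigma quoted at energies[0]
        z = (x - position) / width
        total += amplitude * math.exp(-0.5 * z * z) / (width * math.sqrt(2.0 * math.pi))
    return total


energies = (5.90, 6.49)
# Widths of the two lines for a common sigma of 10 (in the units of x).
widths = [10.0 * math.sqrt(energy / energies[0]) for energy in energies]
# Value at the first peak when only the first line is switched on.
peak = toy_gaussian_forest(590.0, energies, (1.0, 0.0), energy_scale=100.0, sigma=10.0)
```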

evaluate(x: float | ndarray, *parameter_values) → float | ndarray[source]#

Evaluate the model at a given set of parameter values.

Arguments#

x (array_like): The value(s) of the independent variable.
parameter_values (sequence of float): The value of the model parameters.

Returns#

y (array_like): The value(s) of the model at the given value(s) of the independent variable for a given set of parameter values.

freeze(model_function, **constraints) → Callable[source]#: Overloaded method.

_intensities()[source]#: Return the current values of the line intensities for the forest, properly normalized to one.

fwhm()[source]#: Calculate the FWHM of the main line of the forest.
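For a Gaussian line the full width at half maximum is 2 sqrt(2 ln 2) times sigma, or about 2.355 sigma. The sketch below combines that relation with a normalization of the intensities; taking the "main line" to be the most intense one, and quoting sigma at the energy of the first line, are assumptions made for the sake of the example.

```python
import math


def normalized_intensities(amplitudes):
    """Scale the line amplitudes so that they sum to one."""
    total = sum(amplitudes)
    return [amplitude / total for amplitude in amplitudes]


def main_line_fwhm(energies, amplitudes, sigma):
    """FWHM of the most intense line, with sigma scaling as sqrt(energy)."""
    intensities = normalized_intensities(amplitudes)
    main = max(range(len(energies)), key=intensities.__getitem__)
    line_sigma = sigma * math.sqrt(energies[main] / energies[0])
    return 2.0 * math.sqrt(2.0 * math.log(2.0)) * line_sigma


intensities = normalized_intensities([3.0, 1.0])
fwhm = main_line_fwhm([5.90, 6.49], [3.0, 1.0], sigma=10.0)
```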

rvs(size: int = 1, random_state=None)[source]#

Generate random variates from the underlying distribution at the current parameter values.

Arguments#

size (int, optional): The number of random variates to generate (default 1).
random_state (int or np.random.Generator, optional): The random seed or generator to use (default None).

init_parameters(xdata: float | ndarray, ydata: float | ndarray, sigma: float | ndarray = 1.0) → None[source]#: Overloaded method.

fit_iterative(xdata: float | ndarray | Histogram1d, ydata: float | ndarray = None, *, p0: float | ndarray = None, sigma: float | ndarray = None, num_sigma_left: float = 2.0, num_sigma_right: float = 2.0, num_iterations: int = 2, **kwargs) → FitStatus[source]#

Iteratively fit line forest spectrum data within a given number of sigma around the peaks.

This function performs a first fit to the data (either a histogram or scatter plot data) and then repeats the fit iteratively, limiting the fit range to a specified interval defined in terms of deviations (in sigma) around the peaks.

Arguments#

xdata (array_like or Histogram1d): The data (scatter plot x values) or histogram to fit.
ydata (array_like, optional): The y data to fit (if xdata is not a Histogram1d).
p0 (array_like, optional): The initial values for the fit parameters.
sigma (array_like, optional): The uncertainties on the y data.
num_sigma_left (float): The number of sigma on the left of the first peak to be used to define the fitting range.
num_sigma_right (float): The number of sigma on the right of the last peak to be used to define the fitting range.
num_iterations (int): The number of iterations of the fit.
kwargs (dict, optional): Additional keyword arguments passed to fit().

Returns#

FitStatus: The results of the fit.
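The control flow of an iterative, range-limited fit can be sketched for a single Gaussian peak as follows. This is not the library code: `toy_fit_iterative` is a hypothetical name, and the inner fit is a least-squares parabola fit to log(y), used here as a dependency-free stand-in for the real fitting engine (it requires strictly positive y values).

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_gaussian(xdata, ydata):
    """Fit log(y) = a + b*u + c*u**2, with u = x - mean(x); return (peak, mu, sigma)."""
    x0 = sum(xdata) / len(xdata)
    u = [x - x0 for x in xdata]
    logy = [math.log(y) for y in ydata]
    # Normal equations of the quadratic least-squares problem, solved by Cramer's rule.
    s = [sum(ui ** k for ui in u) for k in range(5)]
    t = [sum(li * ui ** k for ui, li in zip(u, logy)) for k in range(3)]
    matrix = [[s[i + j] for j in range(3)] for i in range(3)]
    d = det3(matrix)
    a, b, c = (
        det3([[t[r] if col == k else matrix[r][col] for col in range(3)] for r in range(3)]) / d
        for k in range(3)
    )
    return math.exp(a - 0.25 * b * b / c), x0 - 0.5 * b / c, math.sqrt(-0.5 / c)


def toy_fit_iterative(xdata, ydata, num_sigma_left=2.0, num_sigma_right=2.0, num_iterations=2):
    """First fit on the full range, then refit within a window around the peak."""
    fit_range = (min(xdata), max(xdata))
    peak, mu, sigma = fit_gaussian(xdata, ydata)
    for _ in range(num_iterations):
        fit_range = (mu - num_sigma_left * sigma, mu + num_sigma_right * sigma)
        selected = [(x, y) for x, y in zip(xdata, ydata) if fit_range[0] <= x <= fit_range[1]]
        peak, mu, sigma = fit_gaussian(*zip(*selected))
    return peak, mu, sigma, fit_range


# Noise-free Gaussian peak: height 1000, mean 50, sigma 5.
xdata = [float(x) for x in range(20, 81)]
ydata = [1000.0 * math.exp(-0.5 * ((x - 50.0) / 5.0) ** 2) for x in xdata]
peak, mu, sigma, fit_range = toy_fit_iterative(xdata, ydata)
```

On clean data every pass returns the same parameters; the point of restricting the range is that, on real spectra, tails and backgrounds far from the peaks no longer pull on the fit.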

default_plotting_range() → Tuple[float, float][source]#: Overloaded method.

plot(axes: Axes = None, fit_output: bool = False, plot_components: bool = True, **kwargs) → Axes[source]#

Overloaded method for plotting the model.

Arguments#

axes (matplotlib.axes.Axes, optional): The axes on which to plot the model. If None, uses the current axes.
fit_output (bool, optional): If True, displays the fit output on the legend. Default is False.
plot_components (bool, optional): If True, plots the individual components of the model as dashed lines. Default is True.
kwargs: Additional keyword arguments passed to the parent class.

Returns#

None

class aptapy.modeling.FitModelSum(*components: AbstractFitModel)[source]#

Composite model representing the sum of an arbitrary number of simple models.

Arguments#

components (sequence of AbstractFitModel): The components of the composite model.

name() → str[source]#: Return the model name.

freeze(model_function, **constraints) → Callable[source]#

Overloaded method.

This is a tricky one, for two distinct reasons: (i) for a FitModelSum object evaluate() is not a static method, as it needs to access the list of components to sum over; (ii) since components can be added at runtime, the original signature of the function is generic, so we need to build a new signature that reflects the actual parameters of the model when we actually want to use it in a fit. In order to make this work, when freezing parameters we build a wrapper around evaluate() with the correct signature, and pass it downstream to the static freeze() method of the parent class AbstractFitModel.
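The signature-rebuilding trick can be illustrated with the standard `inspect` module. In the sketch below, `toy_freeze` is a hypothetical helper that merges the two steps described above (wrapping and freezing) into one: it wraps a generic `evaluate(x, *parameter_values)` and advertises only the free parameters through `__signature__`. This matters for callers that introspect the model function to count its parameters, as scipy.optimize.curve_fit does when no p0 is given.

```python
import inspect


def toy_freeze(evaluate, parameter_names, **constraints):
    """Return a wrapper of evaluate(x, *values) exposing only the free parameters."""
    free_names = [name for name in parameter_names if name not in constraints]

    def frozen(x, *free_values):
        # Merge the free values (by position) with the frozen ones (by name).
        values = dict(zip(free_names, free_values), **constraints)
        return evaluate(x, *(values[name] for name in parameter_names))

    # Advertise an explicit signature in place of the generic (x, *free_values).
    kind = inspect.Parameter.POSITIONAL_OR_KEYWORD
    parameters = [inspect.Parameter(name, kind) for name in ["x"] + free_names]
    frozen.__signature__ = inspect.Signature(parameters)
    return frozen


def evaluate(x, *parameter_values):
    """Generic model with a runtime-defined parameter list."""
    slope, intercept, amplitude = parameter_values
    return slope * x + intercept + amplitude


frozen = toy_freeze(evaluate, ["slope", "intercept", "amplitude"], intercept=1.0)
exposed = list(inspect.signature(frozen).parameters)
value = frozen(2.0, 3.0, 0.5)
```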

evaluate(x: float | ndarray, *parameter_values) → float | ndarray[source]#

Overloaded method for evaluating the model.

Note this is not a static method, as we need to access the list of components to sum over.
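A minimal sketch of how a composite evaluation can distribute a flat list of parameter values among its components is shown below. `ToySum` and the two plain functions are hypothetical stand-ins; the real class works with AbstractFitModel instances and need not count parameters by introspection as done here.

```python
import inspect
import math


def line(x, slope, intercept):
    return slope * x + intercept


def gaussian(x, amplitude, mu, sigma):
    return amplitude * math.exp(-0.5 * ((x - mu) / sigma) ** 2)


class ToySum:
    """Sum of simple model functions sharing the independent variable."""

    def __init__(self, *components):
        self.components = components

    def evaluate(self, x, *parameter_values):
        # Walk through the flat parameter list, one slice per component.
        total, cursor = 0.0, 0
        for component in self.components:
            num_parameters = len(inspect.signature(component).parameters) - 1
            total += component(x, *parameter_values[cursor:cursor + num_parameters])
            cursor += num_parameters
        return total


model = ToySum(line, gaussian)
# slope=2, intercept=1, then amplitude=3, mu=0, sigma=1.
value = model.evaluate(0.0, 2.0, 1.0, 3.0, 0.0, 1.0)
```

Since `evaluate()` needs `self.components`, it cannot be a static method, which is the point made above.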

integral(x1: float, x2: float) → float[source]#

Calculate the integral of the model between x1 and x2.

This is implemented as the sum of the integrals of the components.

Arguments#

x1 (float): The minimum value of the independent variable to integrate over.
x2 (float): The maximum value of the independent variable to integrate over.

Returns#

integral (float): The integral of the model between x1 and x2.
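Since integration is linear, the integral of a sum is the sum of the integrals. The sketch below shows this for a line plus an area-normalized Gaussian, using closed-form integrals; the function names and the `(function, parameters)` bookkeeping are hypothetical and only meant to mirror the idea.

```python
import math


def line_integral(x1, x2, slope, intercept):
    """Closed-form integral of slope * x + intercept between x1 and x2."""
    return 0.5 * slope * (x2 ** 2 - x1 ** 2) + intercept * (x2 - x1)


def gaussian_integral(x1, x2, area, mu, sigma):
    """Integral of an area-normalized Gaussian between x1 and x2, via erf."""
    z1 = (x1 - mu) / (sigma * math.sqrt(2.0))
    z2 = (x2 - mu) / (sigma * math.sqrt(2.0))
    return 0.5 * area * (math.erf(z2) - math.erf(z1))


# Each component carries its own integral function and parameter values.
components = [
    (line_integral, (2.0, 1.0)),           # slope, intercept
    (gaussian_integral, (10.0, 1.0, 0.1)), # area, mu, sigma
]
total = sum(func(0.0, 2.0, *parameters) for func, parameters in components)
```

Here the line contributes 6 and the Gaussian, fully contained in the interval, contributes its whole area of 10.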

plot(axes: Axes = None, fit_output: bool = False, plot_components: bool = True, **kwargs) → Axes[source]#

Overloaded method for plotting the model.

Arguments#

axes (matplotlib.axes.Axes, optional): The axes on which to plot the model. If None, uses the current axes.
fit_output (bool, optional): If True, displays the fit output on the legend. Default is False.
plot_components (bool, optional): If True, plots the individual components of the model as dashed lines. Default is True.
kwargs: Additional keyword arguments passed to the parent class.

Returns#

None

_format_fit_output(spec: str) → str[source]#

String formatting for fit output.

Arguments#

spec (str): The format specification.

Returns#

text (str): The formatted string.
