phenotypic.analysis.LogGrowthModel

class phenotypic.analysis.LogGrowthModel(on: str, groupby: List[str], time_label: str = 'Metadata_Time', agg_func: Callable | str | list | dict | None = 'mean', lam=1.2, alpha=2, Kmax_label: str | None = None, loss: Literal['linear'] = 'linear', verbose: bool = False, n_jobs: int = 1)

Bases: ModelFitter

Represents a logistic growth model fitter.

This class configures and fits logistic growth models to grouped data. It also provides methods for visualizing the fitted models and exposes the results for further processing.

Logistic Kinetics Model:

\[N(t) = \frac{K}{1 + \frac{K - N_0}{N_0} e^{-rt}}\]

\(N(t)\): population size at time \(t\)

\(N_0\): initial population size (at time \(t = 0\))

\(r\): growth rate

\(K\): carrying capacity (maximum population size)

From this we derive:

\[\mu_{\max} = \frac{K r}{4}\]

\(\mu_{\max}\): maximum specific growth rate
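
The expression for \(\mu_{\max}\) can be recovered from the logistic curve in a few steps. The following is a sketch of one standard derivation, not necessarily the reasoning used by the library authors; the symbol \(t^*\) is introduced here for illustration only. Differentiating \(N(t)\) gives the logistic rate equation:

\[\frac{dN}{dt} = r N \left(1 - \frac{N}{K}\right)\]

The right-hand side is a downward-opening parabola in \(N\), so the rate peaks at \(N = K/2\):

\[\left.\frac{dN}{dt}\right|_{N = K/2} = r \cdot \frac{K}{2} \cdot \frac{1}{2} = \frac{K r}{4}\]

Assuming \(N_0 < K/2\), the curve reaches this point at:

\[t^* = \frac{1}{r} \ln\left(\frac{K - N_0}{N_0}\right)\]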

Loss Function:

To solve for the parameters, we use the following loss function with the SciPy linear least-squares solver:

\[J(K, N_0, r) = \frac{1}{n}\sum_{i=1}^{n} \frac{1}{2}\left(f_{K,N_0,r}(t^{(i)}) - N_t^{(i)}\right)^2 + \lambda\left(\left(\frac{dN}{dt}\right)^2 + N_0^2\right) + \alpha \frac{\lvert K - \max(N_t) \rvert}{N_t}\]

\(\lambda\): regularization term for growth rate and initial population size

\(\alpha\): penalty term for deviations in carrying capacity relative to the largest measurement
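
To make the three terms of the loss concrete, the following is a minimal pure-Python sketch that evaluates \(J(K, N_0, r)\) for one group of measurements. It is not the library's implementation. The helper names `logistic` and `penalized_loss` and the synthetic data are hypothetical. Two readings of the formula are assumptions: the \(\left(\frac{dN}{dt}\right)^2\) term is taken to be \(r^2\) (following the description of \(\lambda\) as a penalty on growth rate), and the denominator \(N_t\) in the last term is taken to be \(\max(N_t)\) (following the description of \(\alpha\)).

```python
import math


def logistic(t, r, K, N0):
    """Logistic curve N(t) = K / (1 + ((K - N0) / N0) * exp(-r * t))."""
    return K / (1.0 + ((K - N0) / N0) * math.exp(-r * t))


def penalized_loss(K, N0, r, times, sizes, lam=1.2, alpha=2.0):
    """Evaluate J(K, N0, r) for one group (hypothetical helper, not library code)."""
    n = len(times)
    # Data term: mean of the halved squared residuals.
    data_term = sum(
        0.5 * (logistic(t, r, K, N0) - y) ** 2 for t, y in zip(times, sizes)
    ) / n
    # ASSUMPTION: the (dN/dt)^2 term is read as r^2.
    regularization = lam * (r ** 2 + N0 ** 2)
    # ASSUMPTION: the denominator N_t is read as max(N_t).
    largest = max(sizes)
    capacity_penalty = alpha * abs(K - largest) / largest
    return data_term + regularization + capacity_penalty


# Synthetic, noise-free measurements generated from known parameters.
times = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
sizes = [logistic(t, r=0.8, K=100.0, N0=2.0) for t in times]

loss_true = penalized_loss(100.0, 2.0, 0.8, times, sizes)
loss_off = penalized_loss(150.0, 2.0, 0.8, times, sizes)
print(f"loss at generating parameters: {loss_true:.4f}")
print(f"loss with K overestimated:     {loss_off:.4f}")
```

Under these assumptions, the loss at the generating parameters is not zero, because the \(\lambda\) term still penalizes nonzero \(r\) and \(N_0\). Overestimating \(K\) raises both the data term and the \(\alpha\) penalty.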

Parameters:
lam

The penalty factor applied to growth rates.

Type:

float

alpha

The maximum penalty factor applied to the carrying capacity.

Type:

float

loss

The loss calculation method used for fitting.

Type:

Literal["linear"]

verbose

A flag to enable or disable detailed logging.

Type:

bool

time_label

The column name representing the time dimension in the input data.

Type:

str

Kmax_label

The column name for the maximum carrying capacity values, if provided.

Type:

str | None

Methods

__init__

Initializes the fitter with its grouping, aggregation, penalty, and loss configuration.

analyze

model_func

Computes the value of the logistic growth model for a given time point or array of time points and parameters.

results

show

Visualizes model predictions alongside measurements, allowing optional filtering by specified criteria and plotting configuration.

__init__(on: str, groupby: List[str], time_label: str = 'Metadata_Time', agg_func: Callable | str | list | dict | None = 'mean', lam=1.2, alpha=2, Kmax_label: str | None = None, loss: Literal['linear'] = 'linear', verbose: bool = False, n_jobs: int = 1)

Initializes the fitter. The arguments configure data grouping, the time column, aggregation, penalty factors, loss calculation, verbosity, and the number of parallel jobs.

Parameters:
  • on (str) – The target variable or column to process.

  • groupby (List[str]) – The columns that define the grouping structure.

  • time_label (str) – Column name that represents time in the data. Defaults to 'Metadata_Time'.

  • agg_func (Callable | str | list | dict | None) – Aggregation function(s) to apply to grouped data. The value is passed to pandas.DataFrame.groupby.agg(). Defaults to 'mean'.

  • lam (float) – The penalty factor applied to growth rates. Defaults to 1.2.

  • alpha (float) – The maximum penalty factor applied to the carrying capacity. Defaults to 2.

  • Kmax_label (str | None) – Column name that provides maximum K value for processing. Defaults to None.

  • loss (Literal["linear"]) – Loss calculation method to apply. Defaults to "linear".

  • verbose (bool) – If True, enables detailed logging for process execution. Defaults to False.

  • n_jobs (int) – Number of parallel jobs to execute. Defaults to 1.

analyze(data: DataFrame) → DataFrame
Parameters:

data (DataFrame)

Return type:

DataFrame

show(tmax: int | float | None = None, criteria: Dict[str, Any | List[Any]] | None = None, figsize=(6, 4), cmap: str | None = 'tab20', legend=True, ax: Axes | None = None, **kwargs) → Tuple[Figure, Axes]

Visualizes model predictions alongside measurements, allowing optional filtering by specified criteria and plotting configuration.

Parameters:
  • tmax (int | float | None, optional) – The maximum time value for plotting. If set to None, the maximum time value will be determined from the data automatically.

  • criteria (Dict[str, Union[Any, List[Any]]] | None, optional) – A dictionary specifying filtering criteria for data selection. When provided, only data matching the criteria will be used for plotting.

  • figsize (tuple, optional) – A tuple specifying the size of the figure. Defaults to (6, 4).

  • cmap (str | None, optional) – A string naming either a matplotlib colormap or a single color (e.g., 'red', '#FF0000'). If a colormap is given, colors are cycled through it. If a single color is given, all lines use that color. Defaults to 'tab20'.

  • legend (bool, optional) – A boolean that controls whether a legend is displayed on the plot. Defaults to True.

  • ax (plt.Axes, optional) – A matplotlib Axes object on which to plot. If not provided, a new figure and axes object will be created.

  • **kwargs – Additional matplotlib parameters to customize the plot. Common options include:

    - dpi: Figure resolution (default 100)
    - facecolor: Figure background color
    - edgecolor: Figure edge color
    - line_width: Line width for prediction lines
    - marker_size: Size of data point markers
    - elinewidth: Error bar line width
    - capsize: Error bar cap size
    - title: Custom figure title
    - xlabel: Custom x-axis label
    - ylabel: Custom y-axis label
    - legend_loc: Legend location (default 'best')
    - legend_fontsize: Font size for legend

Returns:

A tuple containing the matplotlib Figure and Axes objects used for plotting.

Return type:

Tuple[plt.Figure, plt.Axes]

Raises:

KeyError – If the group keys for model results and measurements do not align, or if specified columns are missing from the input data.

results() → DataFrame
Return type:

DataFrame

static model_func(t: ndarray[float] | float, r: float, K: float, N0: float)

Computes the value of the logistic growth model for a given time point or array of time points and parameters. The logistic model describes growth that initially increases exponentially but levels off as the population reaches a carrying capacity.

This static method uses the formula:

N(t) = K / (1 + [(K - N0) / N0] * exp(-r * t))

Where:

t: Time (independent variable, can be scalar or array).

r: Growth rate.

K: Carrying capacity (maximum population size).

N0: Initial population size.

Parameters:
  • t (np.ndarray[float] | float) – Time at which the population is calculated. Can be a single value or an array of values.

  • r (float) – Growth rate of the population.

  • K (float) – Carrying capacity or the maximum population size.

  • N0 (float) – Initial population size at time t=0.

Returns:

The computed population size at the given time or array of times based on the logistic growth model.

Return type:

float | np.ndarray[float]
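
The behavior of the formula can be checked with a small standalone sketch. The function below is a scalar, pure-Python stand-in written from the documented formula, not the library's `model_func` itself. The name `logistic` and the parameter values are hypothetical. It checks that the curve starts at N0, levels off at K, crosses K / 2 at the inflection time, and that the slope there matches the \(\mu_{\max} = \frac{K r}{4}\) relation stated earlier on this page.

```python
import math


def logistic(t, r, K, N0):
    """Scalar stand-in for the documented formula (not the library's code)."""
    return K / (1.0 + ((K - N0) / N0) * math.exp(-r * t))


r, K, N0 = 0.5, 100.0, 1.0

start = logistic(0.0, r, K, N0)      # equals N0 by construction
plateau = logistic(50.0, r, K, N0)   # approaches K for large t

# Time at which the curve crosses K / 2 (valid when N0 < K / 2).
t_half = math.log((K - N0) / N0) / r
midpoint = logistic(t_half, r, K, N0)

# Central finite difference approximates the slope dN/dt at t_half.
h = 1e-5
slope = (logistic(t_half + h, r, K, N0) - logistic(t_half - h, r, K, N0)) / (2 * h)

print(start, plateau, midpoint, slope, K * r / 4)
```

The documented method also accepts arrays of time points. This sketch keeps to scalars so that it runs without NumPy.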