Analysis#
The analysis module provides functions for analyzing simulation results, including correlation analysis, kernel density estimation (KDE), and mutual information computation.
Correlation Functions#
stochastix.analysis.autocorrelation #
autocorrelation(
results: SimulationResults,
n_points: int = 1000,
species: str | tuple[str, ...] = '*',
) -> tuple[jnp.ndarray, jnp.ndarray]
Compute the autocorrelation of species trajectories.
This function first interpolates the simulation data onto a regular time grid. Then, it calculates the normalized autocorrelation for each specified species' trajectory. The core computation is JIT-compiled for performance.
Parameters:
-
results(SimulationResults) –The
SimulationResultsobject from astochsimsolvesimulation. -
n_points(int, default:1000) –The number of points for the interpolation grid.
-
species(str | tuple[str, ...], default:'*') –The species for which to compute the autocorrelation. Can be "*" for all species, or a string or tuple of strings for specific species.
Returns:
-
tuple[ndarray, ndarray]–Tuple
(lags, autocorrs)where:lags: The time lags for the autocorrelation, in the same units as the simulation time.autocorrs: A 2D array whereautocorrs[:, i]is the autocorrelation of the i-th species.
stochastix.analysis.cross_correlation #
cross_correlation(
results: SimulationResults,
species1: str,
species2: str,
n_points: int = 1000,
) -> tuple[jnp.ndarray, jnp.ndarray]
Compute the cross-correlation between two species trajectories.
This function interpolates the simulation data onto a regular time grid and then computes the normalized cross-correlation between two specified species.
Parameters:
-
results(SimulationResults) –The
SimulationResultsobject from astochsimsolvesimulation. -
species1(str) –The name of the first species.
-
species2(str) –The name of the second species.
-
n_points(int, default:1000) –The number of points for the interpolation grid.
Returns:
-
tuple[ndarray, ndarray]–A tuple
(lags, cross_corr)where:lags: The time lags for the cross-correlation.cross_corr: A 1D array of the cross-correlation values.
Kernel Density Estimation#
stochastix.analysis.kde_triangular #
kde_triangular(
x: ndarray,
n_grid_points: int | None = None,
min_max_vals: tuple[float, float] | None = None,
density: bool = True,
weights: ndarray | None = None,
bw_multiplier: float = 1.0,
*,
dirichlet_alpha: float | None = 0.1,
dirichlet_kappa: float | None = None,
) -> tuple[jnp.ndarray, jnp.ndarray]
Backward-compatible triangular specialization of kde.
stochastix.analysis.kde_exponential #
kde_exponential(
x: ndarray,
n_grid_points: int | None = None,
min_max_vals: tuple[float, float] | None = None,
density: bool = True,
weights: ndarray | None = None,
bw_multiplier: float = 1.0,
*,
dirichlet_alpha: float | None = 0.1,
dirichlet_kappa: float | None = None,
) -> tuple[jnp.ndarray, jnp.ndarray]
Backward-compatible exponential specialization of kde.
stochastix.analysis.kde_gaussian #
kde_gaussian(
x: ndarray,
n_grid_points: int | None = None,
min_max_vals: tuple[float, float] | None = None,
density: bool = True,
weights: ndarray | None = None,
bw_multiplier: float = 1.0,
*,
dirichlet_alpha: float | None = 0.1,
dirichlet_kappa: float | None = None,
) -> tuple[jnp.ndarray, jnp.ndarray]
Backward-compatible Gaussian specialization of kde.
stochastix.analysis.kde_triangular_2d #
kde_triangular_2d(
x1: ndarray,
x2: ndarray,
n_grid_points1: int | None = None,
n_grid_points2: int | None = None,
min_max_vals1: tuple[float, float] | None = None,
min_max_vals2: tuple[float, float] | None = None,
density: bool = True,
weights: ndarray | None = None,
bw_multiplier: float = 1.0,
*,
dirichlet_alpha: float | None = 0.1,
dirichlet_kappa: float | None = None,
) -> tuple[jnp.ndarray, jnp.ndarray, jnp.ndarray]
Backward-compatible triangular specialization of kde_2d.
stochastix.analysis.state_kde #
state_kde(
results: SimulationResults,
species: str | tuple[str, ...],
n_grid_points: int | None = None,
min_max_vals: tuple[float, float] | None = None,
density: bool = True,
t: int | float = -1,
*,
kde_type: str = 'wendland_c2',
bw_multiplier: float = 1.0,
dirichlet_alpha: float | None = 0.1,
dirichlet_kappa: float | None = None,
) -> tuple[jnp.ndarray, jnp.ndarray]
Compute a kernel density estimate (KDE) of the state distribution.
This function is a convenience wrapper around KDE functions that specifically
operates on the state of batched SimulationResults. It supports multiple
kernel types for density estimation.
Parameters:
-
results(SimulationResults) –The
SimulationResultsfrom astochsimsolvesimulation. This should contain a batch of simulation trajectories (e.g. from vmapping overstochsimsolve). -
species(str | tuple[str, ...]) –The species for which to compute the KDE. Can be a single species name or a tuple of names.
-
n_grid_points(int | None, default:None) –Number of grid points to use. If
None, inferred from the integer span[floor(min(x)), ceil(max(x))]. -
min_max_vals(tuple[float, float] | None, default:None) –Tuple
(min_val, max_val)defining the grid range. IfNone, determined from data. -
density(bool, default:True) –If
True, returns a probability density function whose Riemann sum over the grid integrates to 1. IfFalse, returns unnormalized counts/weights per grid point. -
t(int | float, default:-1) –The time point (float) or time index (int) at which to compute the KDE. If
-1(default), uses the final time point. -
kde_type(str, default:'wendland_c2') –Type of kernel to use. One of
'triangular','exponential','gaussian', or'wendland_c2'. Default is'wendland_c2'. -
bw_multiplier(float, default:1.0) –Kernel bandwidth multiplier. Controls the width of the kernel relative to the grid step size. Default is
1.0. -
dirichlet_alpha(float | None, default:0.1) –Per-bin pseudo-count for Dirichlet smoothing. Default is
0.1. Note:dirichlet_kappatakes priority over this parameter if provided. -
dirichlet_kappa(float | None, default:None) –Total pseudo-count for Dirichlet smoothing. If provided, takes priority over
dirichlet_alphaandalpha = kappa / Kwhere K is the number of grid points. IfNone, usesdirichlet_alphainstead.
Returns:
-
tuple[ndarray, ndarray]–A tuple
(grid, values)where:grid: 1D array of evaluation points (grid centers), shape(n_grid_points,).values: 2D array wherevalues[:, i]is the KDE values for the i-th species at the specified time point, shape(n_grid_points, n_species). Ifdensity=True, these approximate a PDF.
Mutual Information#
stochastix.analysis.mutual_information #
mutual_information(
x1: ndarray,
x2: ndarray,
n_grid_points1: int | None = None,
n_grid_points2: int | None = None,
min_max_vals1: tuple[float, float] | None = None,
min_max_vals2: tuple[float, float] | None = None,
base: float = 2.0,
*,
kde_type: str = 'wendland_c2',
bw_multiplier: float = 1.0,
dirichlet_alpha: float | None = 0.1,
dirichlet_kappa: float | None = None,
) -> jnp.ndarray
Compute the mutual information between two arrays.
This function uses KDE functions to compute the mutual information between
two arrays x1 and x2. The mutual information is a measure of the mutual
dependence between the two variables.
JIT-compatibility
For JIT-compatibility, provide concrete values for all grid parameters
(n_grid_points1, n_grid_points2, min_max_vals1, min_max_vals2).
If left as None, grid parameters are determined from the data (not
JIT-able).
Parameters:
-
x1(ndarray) –1D array of the first input data.
-
x2(ndarray) –1D array of the second input data. Must have the same length as
x1. -
n_grid_points1(int | None, default:None) –Number of grid points for the first dimension. If
None, determined automatically. -
n_grid_points2(int | None, default:None) –Number of grid points for the second dimension. If
None, determined automatically. -
min_max_vals1(tuple[float, float] | None, default:None) –Tuple
(min_val, max_val)for the first dimension's grid range. IfNone, determined automatically. -
min_max_vals2(tuple[float, float] | None, default:None) –Tuple
(min_val, max_val)for the second dimension's grid range. IfNone, determined automatically. -
base(float, default:2.0) –The logarithmic base to use for the entropy calculation. Default is
2.0(bits). -
kde_type(str, default:'wendland_c2') –Type of kernel to use. One of
'triangular','exponential','gaussian', or'wendland_c2'. Default is'wendland_c2'. -
bw_multiplier(float, default:1.0) –Kernel bandwidth multiplier. Controls the width of the kernel relative to the grid step size. Default is
1.0. -
dirichlet_alpha(float | None, default:0.1) –Per-bin pseudo-count for Dirichlet smoothing. Default is
0.1. Note:dirichlet_kappatakes priority over this parameter if provided. -
dirichlet_kappa(float | None, default:None) –Total pseudo-count for Dirichlet smoothing. If provided, takes priority over
dirichlet_alpha. IfNone, usesdirichlet_alphainstead.
Returns:
-
ndarray–The mutual information between
x1andx2in the specified base.
stochastix.analysis.state_mutual_info #
state_mutual_info(
results: SimulationResults,
species_at_t: Iterable[tuple[str, int | float]],
n_grid_points1: int | None = None,
n_grid_points2: int | None = None,
min_max_vals1: tuple[float, float] | None = None,
min_max_vals2: tuple[float, float] | None = None,
base: float = 2.0,
*,
kde_type: str = 'wendland_c2',
bw_multiplier: float = 1.0,
dirichlet_alpha: float | None = 0.1,
dirichlet_kappa: float | None = None,
) -> jnp.ndarray
Compute the mutual information between two species at specific time points.
This function calculates the mutual information between the distributions of
two species at two potentially different time points, t1 and t2, from
batched simulation results. It uses KDE functions to ensure the entire
computation is end-to-end differentiable, which is useful for gradient-based
optimization of simulation parameters.
JIT-compatibility
For JIT-compatibility, provide concrete values for all grid parameters
(n_grid_points1, n_grid_points2, min_max_vals1, min_max_vals2).
If left as None, grid parameters are determined from the data (not
JIT-able).
Parameters:
-
results(SimulationResults) –The
SimulationResultsfrom astochsimsolvesimulation. This should contain a batch of simulation trajectories (e.g., from vmapping overstochsimsolve). -
species_at_t(Iterable[tuple[str, int | float]]) –An iterable containing two tuples, where each tuple consists of a species name and a time point, e.g.,
[('S1', t1), ('S2', t2)]. The time point can be an integer index or a float time value. -
n_grid_points1(int | None, default:None) –Number of grid points for the first species. If
None, determined automatically (not JIT-compatible). -
n_grid_points2(int | None, default:None) –Number of grid points for the second species. If
None, determined automatically (not JIT-compatible). -
min_max_vals1(tuple[float, float] | None, default:None) –Tuple
(min_val, max_val)for the first species' grid range. IfNone, determined automatically (not JIT-compatible). -
min_max_vals2(tuple[float, float] | None, default:None) –Tuple
(min_val, max_val)for the second species' grid range. IfNone, determined automatically (not JIT-compatible). -
base(float, default:2.0) –The logarithmic base for the entropy calculation. Default is
2.0(bits). -
kde_type(str, default:'wendland_c2') –Type of kernel to use. One of
'triangular','exponential','gaussian', or'wendland_c2'. Default is'wendland_c2'. -
bw_multiplier(float, default:1.0) –Kernel bandwidth multiplier. Controls the width of the kernel relative to the grid step size. Default is
1.0. -
dirichlet_alpha(float | None, default:0.1) –Per-bin pseudo-count for Dirichlet smoothing. Default is
0.1. Note:dirichlet_kappatakes priority over this parameter if provided. -
dirichlet_kappa(float | None, default:None) –Total pseudo-count for Dirichlet smoothing. If provided, takes priority over
dirichlet_alpha. IfNone, usesdirichlet_alphainstead.
Returns:
-
ndarray–The mutual information between the distributions of the two specified
-
ndarray–species at their respective time points.