spac.spatial_analysis.ripley_l(adata, annotation, phenotypes, distances, regions=None, spatial_key='spatial', n_simulations=1, area=None, seed=42)

Calculate Ripley’s L statistic for spatial data in adata.

Ripley’s L statistic is a spatial point pattern analysis metric that quantifies clustering or regularity in point patterns across various distances. This function calculates the statistic for each region in adata (if provided) or for all cells if regions are not specified.

Parameters:
  • adata (AnnData) – The annotated data matrix containing the spatial coordinates and cell phenotypes.

  • annotation (str) – The key in adata.obs representing the annotation for cell phenotypes.

  • phenotypes (list of str) – A list containing the two phenotypes for which Ripley’s L statistic will be calculated. The first is treated as the center phenotype and the second as the neighbor phenotype. If both entries are the same, the statistic describes the spatial pattern within that phenotype; if they differ, it describes the interaction between the two.

  • distances (array-like) – An array of distances at which to calculate Ripley’s L statistic. The values must be positive and listed in increasing order.

  • regions (str or None, optional) – The key in adata.obs representing regions for stratifying the data. If None, all cells will be treated as one region.

  • spatial_key (str, optional) – The key in adata.obsm representing the spatial coordinates. Default is “spatial”.

  • n_simulations (int, optional) – Number of simulations to perform for significance testing. Default is 1.

  • area (float or None, optional) – The area of the spatial region of interest. If None, the area will be inferred from the data. Default is None.

  • seed (int, optional) – Random seed for simulation reproducibility. Default is 42.

Returns:

A DataFrame containing the Ripley’s L results for each region, or for the entire dataset if regions is None. The DataFrame includes the following columns:

  • region – The region label, or ‘all’ if no regions are specified.

  • center_phenotype – The first phenotype in phenotypes.

  • neighbor_phenotype – The second phenotype in phenotypes.

  • ripley_l – The Ripley’s L statistic calculated for the region.

  • config – A dictionary with the configuration settings used for the calculation.

Return type:

pd.DataFrame

Notes

Ripley’s L is a transformed version of Ripley’s K that compensates for the growth of K with distance: under complete spatial randomness K(r) grows as πr², so L(r) = sqrt(K(r) / π) is approximately equal to r. This statistic is used to evaluate spatial clustering or dispersion of points (cells) in biological datasets.
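To make the relationship between K and L concrete, here is a minimal sketch of a naive estimator in plain Python. It is an illustration only, not the implementation used by this function: the helper name naive_ripley_l and its same_population flag are hypothetical, and no edge correction is applied, so values near the boundary of the study area will be biased.

```python
import math


def naive_ripley_l(centers, neighbors, distances, area, same_population=False):
    """Naive Ripley's L without edge correction (illustrative sketch only).

    centers, neighbors: sequences of (x, y) tuples.
    distances: radii at which to evaluate L.
    area: area of the study region, in squared coordinate units.
    same_population: set True when centers and neighbors are the same
        list, so that each point is not counted as its own neighbor.
    """
    n_centers, n_neighbors = len(centers), len(neighbors)
    # Number of ordered (center, neighbor) pairs that can be counted.
    n_pairs = n_centers * (n_neighbors - 1) if same_population else n_centers * n_neighbors
    if n_pairs <= 0 or area <= 0:
        raise ValueError("need at least one valid pair and a positive area")

    results = []
    for r in distances:
        count = 0
        for i, (cx, cy) in enumerate(centers):
            for j, (nx, ny) in enumerate(neighbors):
                if same_population and i == j:
                    continue  # skip the self-pair
                if math.hypot(cx - nx, cy - ny) <= r:
                    count += 1
        k_value = area * count / n_pairs              # K(r)
        results.append(math.sqrt(k_value / math.pi))  # L(r) = sqrt(K(r) / pi)
    return results


# One center with neighbors at distance 1 and 3, inside an area of 100.
l_values = naive_ripley_l([(0.0, 0.0)], [(1.0, 0.0), (3.0, 0.0)], [2.0, 4.0], area=100.0)
```

Values of L(r) above r suggest clustering at that scale and values below r suggest dispersion, subject to the edge effects this sketch ignores.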

The function evaluates the statistic at the supplied distances and runs simulations to assess the significance of observed patterns. The results are stored in the .uns attribute of adata under the key ‘ripley_l’; a new DataFrame is created there if no prior results exist.
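As a rough picture of what simulation-based significance testing can look like, the sketch below draws points uniformly at random in a rectangle and computes a naive L curve for each draw. This page does not describe how the function generates its simulations, so everything here is an assumption: the helper names, the rectangular window, and the uniform null model are hypothetical and chosen only for illustration.

```python
import math
import random


def _naive_l(centers, neighbors, distances, area):
    """Naive Ripley's L for two point sets, without edge correction."""
    n_pairs = len(centers) * len(neighbors)
    values = []
    for r in distances:
        count = sum(
            1
            for cx, cy in centers
            for nx, ny in neighbors
            if math.hypot(cx - nx, cy - ny) <= r
        )
        values.append(math.sqrt(area * count / (n_pairs * math.pi)))
    return values


def simulate_csr_l(n_centers, n_neighbors, distances, width, height,
                   n_simulations=1, seed=42):
    """Return one L curve per simulated random pattern (hypothetical helper)."""
    rng = random.Random(seed)  # a fixed seed makes the runs reproducible
    area = width * height
    runs = []
    for _ in range(n_simulations):
        centers = [(rng.uniform(0, width), rng.uniform(0, height))
                   for _ in range(n_centers)]
        neighbors = [(rng.uniform(0, width), rng.uniform(0, height))
                     for _ in range(n_neighbors)]
        runs.append(_naive_l(centers, neighbors, distances, area))
    return runs


null_curves = simulate_csr_l(20, 20, [10.0, 20.0], width=100.0, height=100.0,
                             n_simulations=3, seed=42)
```

An observed L curve can then be compared against the spread of the simulated curves at each distance; more simulations give a more stable comparison at the cost of run time.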

Examples

Calculate Ripley’s L for two phenotypes in a single region dataset:

>>> import numpy as np
>>> result = ripley_l(adata, annotation='cell_type', phenotypes=['A', 'B'], distances=np.linspace(0, 500, 100))

Calculate Ripley’s L for multiple regions in adata:

>>> result = ripley_l(adata, annotation='cell_type', phenotypes=['A', 'B'], distances=np.linspace(0, 500, 100), regions='region_key')