spac.spatial_analysis module

spac.spatial_analysis.calculate_nearest_neighbor(adata, annotation, spatial_associated_table='spatial', imageid=None, label='spatial_distance', verbose=True)[source]

Computes the shortest distance from each cell to the nearest cell of each phenotype (via scimap.tl.spatial_distance) and stores the resulting DataFrame in adata.obsm[label].

Parameters:
  • adata (anndata.AnnData) – Annotated data matrix with spatial information.

  • annotation (str) – Column name in adata.obs containing cell annotations (i.e., phenotypes).

  • spatial_associated_table (str, optional) – Key in adata.obsm where spatial coordinates are stored. Default is ‘spatial’.

  • imageid (str, optional) – The column in adata.obs specifying image IDs. If None, a dummy image column is created temporarily and spatial distances are computed across the entire dataset as if it were a single image. Default is None.

  • label (str, optional) – The key under which results are stored in adata.obsm. Default is ‘spatial_distance’.

  • verbose (bool, optional) – If True, prints progress messages. Default is True.

Returns:

Modifies adata in place by storing a DataFrame of spatial distances in adata.obsm[label].

Return type:

None

Example

For a dataset with two cells (CellA, CellB) both of the same phenotype “type1”, the output might look like:

>>> adata.obsm['spatial_distance']
       type1
CellA    0.0
CellB    0.0

For a dataset with two phenotypes “type1” and “type2”, the output might look like:

>>> adata.obsm['spatial_distance']
       type1     type2
CellA  0.000000  1.414214
CellB  1.414214  0.000000

Input:

adata.obs:

cell_type  imageid
type1      image1
type1      image1
type2      image1

adata.obsm['spatial']:

[[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]

Output stored in adata.obsm['spatial_distance']:

   type1  type2
0  0.000  1.414
1  1.414  0.000
2  2.236  1.000

Raises:

ValueError – If spatial_associated_table is not found in adata.obsm, or if the spatial coordinates are missing or invalid.
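The idea behind the distance table can be sketched in plain Python. This is only an illustration of "shortest distance from each cell to the nearest cell of each phenotype", not SPAC's or scimap's implementation. The helper name nearest_distance_per_phenotype is hypothetical, and the sketch assumes a cell counts as its own nearest neighbor for its own phenotype (which is why that entry is 0.0 in the two-cell example above).

```python
import math


def nearest_distance_per_phenotype(coords, phenotypes):
    """Return {cell_index: {phenotype: shortest distance}}.

    Assumption: a cell is included when searching its own phenotype,
    so the distance to its own phenotype is 0.0.
    """
    labels = sorted(set(phenotypes))
    result = {}
    for i, point in enumerate(coords):
        row = {}
        for label in labels:
            # Shortest Euclidean distance to any cell carrying this label.
            row[label] = min(
                math.dist(point, other)
                for other, other_label in zip(coords, phenotypes)
                if other_label == label
            )
        result[i] = row
    return result


# Two cells, one per phenotype, one unit apart on each axis.
coords = [(0.0, 0.0), (1.0, 1.0)]
phenotypes = ["type1", "type2"]
table = nearest_distance_per_phenotype(coords, phenotypes)
for cell, row in table.items():
    print(cell, {k: round(v, 6) for k, v in row.items()})
# 0 {'type1': 0.0, 'type2': 1.414214}
# 1 {'type1': 1.414214, 'type2': 0.0}
```

The brute-force loop is quadratic in the number of cells; it is meant to make the definition concrete, not to be used on real datasets.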

spac.spatial_analysis.neighborhood_profile(adata, phenotypes, distances, regions=None, spatial_key='spatial', normalize=None, associated_table_name='neighborhood_profile')[source]

Calculate the neighborhood profile for every cell in all slides in an analysis and update the input AnnData object in place.

Parameters:
  • adata (AnnData) – The AnnData object containing the spatial coordinates and phenotypes.

  • phenotypes (str) – The name of the column in adata.obs that contains the phenotypes.

  • distances (list) – The list of increasing distances for the neighborhood profile.

  • spatial_key (str, optional) – The key in adata.obsm that contains the spatial coordinates. Default is ‘spatial’.

  • normalize (str or None, optional) – If ‘total_cells’, normalize the neighborhood profile based on the total number of cells in each bin. If ‘bin_area’, normalize the neighborhood profile based on the area of every bin. Default is None.

  • associated_table_name (str, optional) – The key in adata.obsm under which the neighborhood profile is stored. Default is ‘neighborhood_profile’.

  • regions (str or None, optional) – The name of the column in adata.obs that contains the regions. If None, all cells in adata will be used. Default is None.

Returns:

The function modifies the input AnnData object in place, adding a new column containing the neighborhood profile to adata.obsm.

Return type:

None

Notes

The input AnnData object ‘adata’ is modified in place. The function stores the neighborhood profile in adata.obsm under the key given by ‘associated_table_name’. The stored profile is a 3D array of shape (n_cells, n_phenotypes, n_bins), where n_cells is the number of cells across all slides, n_phenotypes is the number of unique phenotypes, and n_bins is the number of bins defined by the distances list.

A dictionary is added to adata.uns[associated_table_name] with the two keys “bins” and “labels”. “labels” will store all the values in the phenotype annotation.
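The (n_cells, n_phenotypes, n_bins) layout described above can be sketched in plain Python. This is an illustration of the counting idea, not SPAC's implementation. The helper name is hypothetical, and the sketch assumes that consecutive entries of distances form half-open bins [d_i, d_{i+1}), that a cell is not counted as its own neighbor, and that no normalization is applied.

```python
import math


def neighborhood_profile_sketch(coords, phenotypes, distances):
    """Count neighbors per phenotype and distance bin for every cell.

    Returns (labels, profile) where profile[i][p][b] is the number of
    cells of phenotype labels[p] whose distance to cell i falls in
    [distances[b], distances[b + 1]).
    """
    labels = sorted(set(phenotypes))
    n_bins = len(distances) - 1
    profile = [[[0] * n_bins for _ in labels] for _ in coords]
    for i, center in enumerate(coords):
        for j, other in enumerate(coords):
            if i == j:
                continue  # assumption: a cell is not its own neighbor
            d = math.dist(center, other)
            for b in range(n_bins):
                if distances[b] <= d < distances[b + 1]:
                    profile[i][labels.index(phenotypes[j])][b] += 1
                    break
    return labels, profile


coords = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
phenotypes = ["A", "B", "A"]
distances = [0, 2, 4]  # two bins: [0, 2) and [2, 4)
labels, profile = neighborhood_profile_sketch(coords, phenotypes, distances)
print(labels)      # phenotype order along the second axis
print(profile[0])  # counts for the first cell, indexed [phenotype][bin]
```

In this sketch, labels plays the role of the "labels" entry described for adata.uns[associated_table_name], and distances the role of "bins".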

spac.spatial_analysis.ripley_l(adata, annotation, phenotypes, distances, regions=None, spatial_key='spatial', n_simulations=1, area=None, seed=42, edge_correction=True)[source]

Calculate Ripley’s L statistic for spatial data in adata.

Ripley’s L statistic is a spatial point pattern analysis metric that quantifies clustering or regularity in point patterns across various distances. This function calculates the statistic for each region in adata (if provided) or for all cells if regions are not specified.

Parameters:
  • adata (AnnData) – The annotated data matrix containing the spatial coordinates and cell phenotypes.

  • annotation (str) – The key in adata.obs representing the annotation for cell phenotypes.

  • phenotypes (list of str) – A list containing two phenotypes for which the Ripley L statistic will be calculated. If the two phenotypes are the same, the calculation is done for the same type; if different, it considers interactions between the two.

  • distances (array-like) – An array of distances at which to calculate Ripley’s L statistic. The values must be positive and incremental.

  • regions (str or None, optional) – The key in adata.obs representing regions for stratifying the data. If None, all cells will be treated as one region.

  • spatial_key (str, optional) – The key in adata.obsm representing the spatial coordinates. Default is “spatial”.

  • n_simulations (int, optional) – Number of simulations to perform for significance testing. Default is 1.

  • area (float or None, optional) – The area of the spatial region of interest. If None, the area will be inferred from the data. Default is None.

  • seed (int, optional) – Random seed for simulation reproducibility. Default is 42.

  • edge_correction (bool, optional) – If True, apply edge correction to the Ripley’s L calculation. Default is True.

Returns:

A DataFrame containing the Ripley’s L results for each region or the entire dataset if regions is None. The DataFrame includes the following columns:

  - region: The region label or ‘all’ if no regions are specified.
  - center_phenotype: The first phenotype in phenotypes.
  - neighbor_phenotype: The second phenotype in phenotypes.
  - ripley_l: The Ripley’s L statistic calculated for the region.
  - config: A dictionary with configuration settings used for the calculation.

Return type:

pd.DataFrame

Notes

Ripley’s L is a variance-stabilized transformation of Ripley’s K. In two dimensions it is commonly defined as L(r) = sqrt(K(r) / π), so that under complete spatial randomness L(r) grows linearly with the distance r rather than quadratically. This statistic is used to evaluate spatial clustering or dispersion of points (cells) in biological datasets.

The function uses pre-defined distances and performs simulations to assess the significance of observed patterns. The results are stored in the .uns attribute of adata under the key ‘ripley_l’, or in a new DataFrame if no prior results exist.
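To make the statistic concrete, the following is a minimal sketch of the textbook estimator, K(r) = area × (number of center-neighbor pairs within r) / (number of possible pairs) and L(r) = sqrt(K(r) / π). It is not the computation SPAC performs: it applies no edge correction, runs no simulations, and the helper name ripley_l_sketch is hypothetical.

```python
import math


def ripley_l_sketch(centers, neighbors, distances, area):
    """Naive Ripley's L for each radius, without edge correction.

    If `centers` and `neighbors` are the same list object, a point is
    not paired with itself (same-phenotype case).
    """
    same = centers is neighbors
    n_pairs = len(centers) * (len(neighbors) - 1 if same else len(neighbors))
    results = []
    for r in distances:
        count = 0
        for i, c in enumerate(centers):
            for j, p in enumerate(neighbors):
                if same and i == j:
                    continue
                if math.dist(c, p) <= r:
                    count += 1
        k = area * count / n_pairs            # Ripley's K at radius r
        results.append(math.sqrt(k / math.pi))  # Ripley's L at radius r
    return results


# Two "center" cells and two "neighbor" cells in a 10 x 10 region.
centers = [(0.0, 0.0), (10.0, 0.0)]
neighbors = [(1.0, 0.0), (10.0, 5.0)]
l_values = ripley_l_sketch(centers, neighbors, distances=[2.0, 6.0], area=100.0)
print(l_values)
```

Here one of the four possible pairs lies within radius 2 and two lie within radius 6, so K is 25 and 50 respectively before the square-root transform.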

Examples

Calculate Ripley’s L for two phenotypes in a single region dataset:

>>> result = ripley_l(adata, annotation='cell_type', phenotypes=['A', 'B'], distances=np.linspace(0, 500, 100))

Calculate Ripley’s L for multiple regions in adata:

>>> result = ripley_l(adata, annotation='cell_type', phenotypes=['A', 'B'], distances=np.linspace(0, 500, 100), regions='region_key')

spac.spatial_analysis.spatial_interaction(adata, annotation, analysis_method, stratify_by=None, ax=None, return_matrix=False, seed=None, coord_type=None, n_rings=1, n_neighs=6, radius=None, cmap='seismic', **kwargs)[source]

Perform spatial analysis on the selected annotation in the dataset. Current analysis methods are provided in squidpy:

Neighborhood Enrichment, Cluster Interaction Matrix

Parameters:
  • adata (anndata.AnnData) – The AnnData object.

  • annotation (str) – The column name of the annotation (e.g., phenotypes) to analyze in the provided dataset.

  • analysis_method (str) – The analysis method to use, currently available: “Neighborhood Enrichment” and “Cluster Interaction Matrix”.

  • stratify_by (str or list of strs) – The annotation(s) used to stratify the dataset when generating interaction plots. If a single annotation is passed, the dataset is stratified by the unique labels in that annotation column. If n (n >= 2) annotations are passed, the dataset is stratified by the combinations of labels that actually occur across the passed annotations.

  • ax (matplotlib.axes.Axes, default None) – The matplotlib Axes on which to display the image. This option is only available when stratify_by is None.

  • return_matrix (boolean, default False) – If True, the function returns a list of two dictionaries: the first contains the axes and the second contains the computed matrices. For Neighborhood Enrichment, the matrix is a tuple of the z-score and the enrichment count. For Cluster Interaction Matrix, it is the interaction matrix. If False, the function returns only the axes dictionary.

  • seed (int, default None) – Random seed for reproducibility, used in Neighborhood Enrichment Analysis.

  • coord_type (str, optional) – Type of coordinate system used in sq.gr.spatial_neighbors. Should be either ‘grid’ (Visium data) or ‘generic’ (others). Default is None, in which case the squidpy package decides: if spatial_key is in anndata.uns, coord_type is ‘grid’; otherwise it is ‘generic’.

  • n_rings (int, default 1) – Number of rings of neighbors for grid data. Only used when coord_type = ‘grid’ (Visium).

  • n_neighs (int, optional) – Default is 6. Depending on the coord_type:
    - ‘grid’ (Visium): number of neighboring tiles.
    - ‘generic’: number of neighborhoods for non-grid data.

  • radius (float, optional) – Default is None. Only available when coord_type = ‘generic’. Depending on the type:
    - float: compute the graph based on neighborhood radius.
    - tuple: prune the final graph to only contain edges in interval [min(radius), max(radius)].

  • cmap (str, default 'seismic') – The colormap to use for the plot. The ‘seismic’ colormap consists of three color regions: red for positive values, blue for negative values, and white at the center. It suits spatial interaction results, where positive values indicate clustering and negative values indicate separation. For more colormaps, see https://matplotlib.org/stable/tutorials/colors/colormaps.html

  • **kwargs – Keyword arguments for matplotlib.pyplot.text()
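The stratify_by behavior for several annotations, where only label combinations that actually occur produce a group, can be sketched in plain Python. This is an illustration of the grouping idea only. The helper name stratification_groups, the example column names, and the use of tuples as group keys are assumptions, not SPAC's implementation or key format.

```python
def stratification_groups(rows, stratify_by):
    """Group row indices by the label combinations present in the data.

    `rows` is a list of dicts (one per cell); `stratify_by` is a column
    name or a list of column names.
    """
    if isinstance(stratify_by, str):
        stratify_by = [stratify_by]
    groups = {}
    for index, row in enumerate(rows):
        key = tuple(row[name] for name in stratify_by)
        groups.setdefault(key, []).append(index)
    return groups


# Hypothetical per-cell annotations.
obs = [
    {"slide": "S1", "treatment": "drug"},
    {"slide": "S1", "treatment": "drug"},
    {"slide": "S2", "treatment": "control"},
]
groups = stratification_groups(obs, ["slide", "treatment"])
print(groups)
# {('S1', 'drug'): [0, 1], ('S2', 'control'): [2]}
```

Combinations that never occur together, such as ('S1', 'control') here, produce no group, so no empty plot would be generated for them.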

Returns:

A dictionary containing the results of the spatial interaction analysis. The keys of the dictionary depend on the parameters passed to the function:

Ax : dict or matplotlib.axes.Axes

If stratify_by is not used, returns a single matplotlib.axes.Axes object. If stratify_by is used, returns a dictionary of Axes objects, with keys representing the stratification groups.

Matrix : dict, optional

Contains processed DataFrames of computed matrices with row and column labels applied. If stratify_by is used, the keys represent the stratification groups. For example:

  - results[‘Matrix’][‘GroupA’] for a specific stratification group.
  - If stratify_by is not used, the table is accessible via results[‘Matrix’][‘annotation’].

Return type:

dict

Functions