- spac.utag_functions.utag(adata, channels_to_use=None, slide_key='Slide', save_key: str = 'UTAG Label', filter_by_variance: bool = False, max_dist: float = 20.0, normalization_mode: str = 'l1_norm', keep_spatial_connectivity: bool = False, n_pcs=10, apply_umap: bool = False, umap_kwargs: Dict[str, Any] = {}, apply_clustering: bool = True, clustering_method: Sequence[str] = ['leiden'], resolutions: Sequence[float] = [0.05, 0.1, 0.3, 1.0], leiden_kwargs: Dict[str, Any] | None = None, parc_kwargs: Dict[str, Any] | None = None, parallel: bool = False, processes: int = 1, k=15, random_state=42)
Discover tissue architecture in single-cell imaging data by combining the phenotypes and positional information of cells.
- Parameters:
adata (AnnData) – AnnData object with the spatial positions of cells stored in the ‘spatial’ slot of obsm.
channels_to_use (Optional[Sequence[str]]) – An optional sequence of strings used to subset variables to use. Default (None) is to use all variables.
max_dist (float) – Maximum distance to cut edges within a graph. Should be adjusted depending on the resolution of the images; recommended values are between 20 and 50 depending on magnification. For imaging mass cytometry, where resolution is 1 µm, 20 often gives good results. Default is 20.
slide_key ({str, None}) – Key of adata.obs containing information on the batch structure of the data. For image data this will often be a variable identifying the image, so that image-specific effects are removed from the data. Default is “Slide”.
save_key (str) – Key to be added to the adata object holding the UTAG clusters. Depending on the values of clustering_method and resolutions, the final keys will be of the form “{save_key}_{method}_{resolution}”. Default is “UTAG Label”.
filter_by_variance (bool) – Whether to filter variables by variance. Default is False, which keeps all variables.
normalization_mode (str) – Method used to normalize the adjacency matrix. Default is “l1_norm”; any other value disables normalization.
keep_spatial_connectivity (bool) – Whether to keep the sparse matrices of spatial connectivity and distance in the obsp attribute of the resulting AnnData object. This can be useful in downstream applications. Default is False.
n_pcs – Number of principal components to use for clustering. Default is 10. If None, no PCA is performed and clustering is done on features.
apply_umap (bool) – Whether to build a UMAP representation after message passing. Default is False.
umap_kwargs (Dict[str, Any]) – Keyword arguments to be passed to scanpy.tl.umap for dimensionality reduction after message passing. Default is {} (no extra arguments).
apply_clustering (bool) – Whether to cluster the message passed matrix. Default is True.
clustering_method (Sequence[str]) – Which clustering method(s) to use for clustering of the message passed matrix. Default is [“leiden”].
resolutions (Sequence[float]) – What resolutions should the methods in clustering_method be run at. Default is [0.05, 0.1, 0.3, 1.0].
leiden_kwargs (dict[str, Any]) – Keyword arguments to pass to scanpy.tl.leiden. Default is None.
parc_kwargs (dict[str, Any]) – Keyword arguments to pass to parc.PARC. Default is None.
parallel (bool) – Whether to run the message passing part of the algorithm in parallel. This accelerates the process but consumes more memory. Default is False.
processes (int) – Number of processes to use in parallel; -1 uses all available. Default is 1.
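The interplay of max_dist and normalization_mode can be illustrated with a small pure-Python sketch. It is not the library's implementation: the helper name message_pass, the toy coordinates, and the choice to count each cell as its own neighbour are assumptions made for illustration only. Cells closer than max_dist are connected, each row of the resulting adjacency matrix is optionally L1-normalized, and each cell's features are replaced by the weighted sum over its neighbours.

```python
import math


def message_pass(coords, features, max_dist=20.0, l1_norm=True):
    """Average (or sum) features over neighbours within max_dist.

    Hypothetical helper for illustration; each cell is treated as its
    own neighbour (an assumption), so every row has at least one edge.
    """
    n_cells = len(coords)
    n_features = len(features[0])
    result = []
    for i in range(n_cells):
        # Adjacency row: 1.0 for cells within max_dist, else 0.0.
        weights = [
            1.0 if math.dist(coords[i], coords[j]) <= max_dist else 0.0
            for j in range(n_cells)
        ]
        if l1_norm:
            # L1 normalization: weights in each row sum to 1.
            total = sum(weights)
            weights = [w / total for w in weights]
        # Weighted sum of neighbour features.
        result.append([
            sum(weights[j] * features[j][c] for j in range(n_cells))
            for c in range(n_features)
        ])
    return result


# Two cells 10 units apart and one isolated cell 100 units away.
coords = [(0.0, 0.0), (10.0, 0.0), (100.0, 0.0)]
features = [[1.0, 0.0], [0.0, 1.0], [4.0, 4.0]]

smoothed = message_pass(coords, features, max_dist=20.0, l1_norm=True)
summed = message_pass(coords, features, max_dist=20.0, l1_norm=False)
```

With normalization the two nearby cells end up with the mean of their features, while the isolated cell is unchanged. Without it the features are summed, so cells with many neighbours get larger values.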
- Returns:
adata – AnnData object with UTAG domain predictions for each cell stored in adata.obs, in the column(s) named from save_key.
- Return type:
AnnData
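A minimal usage sketch, assuming the signature documented above. The argument values are illustrative, and run_utag and expected_label_columns are hypothetical helpers, not part of spac. The column names follow the documented “{save_key}_{method}_{resolution}” pattern; how the library renders the resolution in the real column names is an assumption and should be checked against adata.obs after a run.

```python
# Keyword arguments taken from the documented signature; values are illustrative.
utag_kwargs = dict(
    channels_to_use=None,          # use all variables
    slide_key="Slide",
    save_key="UTAG Label",
    max_dist=20.0,
    normalization_mode="l1_norm",
    n_pcs=10,
    clustering_method=["leiden"],
    resolutions=[0.1, 0.3],
    random_state=42,
)


def expected_label_columns(save_key, methods, resolutions):
    """Build column names from the documented key pattern (hypothetical helper)."""
    return [f"{save_key}_{m}_{r}" for m in methods for r in resolutions]


def run_utag(adata):
    """Call utag on an AnnData object with coordinates in adata.obsm['spatial']."""
    # Imported lazily so this sketch can be loaded without spac installed.
    from spac.utag_functions import utag

    return utag(adata, **utag_kwargs)


columns = expected_label_columns(
    utag_kwargs["save_key"],
    utag_kwargs["clustering_method"],
    utag_kwargs["resolutions"],
)
print(columns)
```

After calling run_utag(adata), the listed columns would be the places to look for the domain labels in the returned object's obs table.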