spac.phenotyping.assign_manual_phenotypes(data_df, phenotypes_df, annotation='manual_phenotype', prefix='', suffix='', multiple=True, drop_binary_code=True)

Assign manual phenotypes to the DataFrame and generate summaries.

Parameters:

data_df (pandas.DataFrame) – The DataFrame to which manual phenotypes will be assigned.
phenotypes_df (pandas.DataFrame) – A DataFrame containing phenotype definitions with columns:
- “phenotype_name” : str
  The name of the phenotype.
- “phenotype_code” : str
  The code used to decode the phenotype.
annotation (str, optional) – The name of the column to store the combined phenotype. Default is “manual_phenotype”.
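prefix (str, optional) – Prefix added to each marker name in a phenotype code to form the corresponding column name in data_df. Default is ‘’.
suffix (str, optional) – Suffix added to each marker name in a phenotype code to form the corresponding column name in data_df (for example, suffix ‘_phenotype’ maps the marker ‘cd4’ to the column ‘cd4_phenotype’). Default is ‘’.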
multiple (bool, optional) – Whether to concatenate the names of multiple positive phenotypes. Default is True.
drop_binary_code (bool, optional) – Whether to drop the binary phenotype columns. Default is True.
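
The relationship between a phenotype code, prefix, suffix, and the columns of data_df can be sketched as follows. This is a minimal illustration, not the library's implementation: parse_phenotype_code is a hypothetical helper, and it assumes that a code is a sequence of marker names each followed by ‘+’ (positive) or ‘-’ (negative), and that marker names themselves contain neither character.

```python
import re


def parse_phenotype_code(code, prefix="", suffix=""):
    """Hypothetical helper: map 'cd4+cd8-' to {column_name: expected_value}."""
    # Each rule is a marker name followed by a single '+' or '-' sign.
    pairs = re.findall(r"([^+-]+)([+-])", code)
    # Reject codes with leftover text, e.g. a trailing marker without a sign.
    if not pairs or "".join(marker + sign for marker, sign in pairs) != code:
        raise ValueError(f"Malformed phenotype code: {code!r}")
    # prefix and suffix wrap the marker name to give the column name.
    return {
        f"{prefix}{marker}{suffix}": 1 if sign == "+" else 0
        for marker, sign in pairs
    }


rules = parse_phenotype_code("cd4+cd8-", suffix="_phenotype")
print(rules)  # {'cd4_phenotype': 1, 'cd8_phenotype': 0}
```

Under these assumptions, a cell matches the phenotype when every listed column holds the expected value.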

Returns:

A dictionary with the following keys:
- “phenotypes_counts” : dict
  Counts of cells matching each defined phenotype.
- “assigned_phenotype_counts” : dict
  Counts of cells matching different numbers of phenotypes.
- “multiple_phenotypes_summary” : pandas.DataFrame
  Summary of cells with multiple phenotypes.

Return type:

dict

Notes

The function generates a combined phenotype column, prints summaries of cells matching multiple phenotypes, and returns a dictionary with detailed counts and summaries.
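
To make the three summaries concrete, the sketch below derives them with plain pandas from a boolean match table (one row per cell, one column per phenotype). It is an illustration under stated assumptions rather than the function's actual code: the matches table, the combine_labels helper, and the variable names are hypothetical, and the “no_label” placeholder and the “, ” separator are taken from the example output in this page.

```python
import pandas as pd

# Hypothetical match table: True where a cell satisfies a phenotype definition.
matches = pd.DataFrame({
    "cd4_cells": [False, True, False, False],
    "cd8_cells": [False, False, True, True],
    "cd4_cd8": [False, False, False, True],
})


def combine_labels(row, no_label="no_label"):
    """Join the names of all matching phenotypes, or fall back to no_label."""
    names = [name for name, hit in row.items() if hit]
    return ", ".join(names) if names else no_label


# Combined label per cell.
labels = matches.apply(combine_labels, axis=1)

# Number of cells matching each phenotype.
phenotype_counts = matches.sum().to_dict()

# Number of cells matching 0, 1, 2, ... phenotypes.
num_phenotypes = matches.sum(axis=1)
assigned_counts = num_phenotypes.value_counts().sort_index()

# Label combinations among cells that match more than one phenotype.
multi_summary = (
    labels[num_phenotypes > 1]
    .value_counts()
    .rename_axis("manual")
    .reset_index(name="count")
)
```

Here labels holds one combined label per cell, and the other three objects mirror the shape of the returned summaries.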

Examples

Suppose data_df is a DataFrame with binary phenotype columns and phenotypes_df contains the following definitions:

>>> import pandas as pd
>>> from spac.phenotyping import assign_manual_phenotypes
>>> data_df = pd.DataFrame({
...     'cd4_phenotype': [0, 1, 0, 1],
...     'cd8_phenotype': [0, 0, 1, 1]
... })
>>> phenotypes_df = pd.DataFrame([
...     {"phenotype_name": "cd4_cells", "phenotype_code": "cd4+cd8-"},
...     {"phenotype_name": "cd8_cells", "phenotype_code": "cd8+"},
...     {"phenotype_name": "cd4_cd8", "phenotype_code": "cd4+cd8+"}
... ])
>>> result = assign_manual_phenotypes(
...     data_df,
...     phenotypes_df,
...     annotation="manual",
...     prefix='',
...     suffix='_phenotype',
...     multiple=True
... )

The data_df DataFrame is modified in place and gains a new column “manual” holding the combined phenotype labels:

>>> print(data_df)
   cd4_phenotype  cd8_phenotype              manual
0              0              0            no_label
1              1              0           cd4_cells
2              0              1           cd8_cells
3              1              1  cd8_cells, cd4_cd8

The result dictionary contains counts and summaries as follows:

>>> print(result["phenotypes_counts"])
{'cd4_cells': 1, 'cd8_cells': 2, 'cd4_cd8': 1}

>>> print(result["assigned_phenotype_counts"])
0    1
1    2
2    1
Name: num_phenotypes, dtype: int64

>>> print(result["multiple_phenotypes_summary"])
               manual  count
0  cd8_cells, cd4_cd8      1