spac.phenotyping.decode_phenotype(data, phenotype_code, **kwargs)

Convert a phenotype code into a dictionary that maps each feature (marker) name to its classification, ‘+’ or ‘-’.

Parameters:
  • data (pandas.DataFrame) – The DataFrame containing the columns that will be used to decode the phenotype.

  • phenotype_code (str) – The phenotype code string, which should end with ‘+’ or ‘-’.

  • **kwargs (keyword arguments) –

    Optional keyword arguments to specify prefix and suffix to be added to the column names.

    • prefix : str, optional

      Prefix to be added to the column names for the feature classification. Default is ‘’.

    • suffix : str, optional

      Suffix to be added to the column names for the feature classification. Default is ‘’.

Returns:

A dictionary where the keys are column names and the values are the corresponding phenotype classification.

Return type:

dict

Raises:

ValueError – If the phenotype code does not end with ‘+’ or ‘-’, or if any column specified in the phenotype code does not exist in the DataFrame.

Notes

The function splits the phenotype code on ‘+’ and ‘-’ characters to determine the phenotype columns and values. It checks if the columns exist in the DataFrame and whether they are binary or string types to properly map values.
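
The behaviour described in these notes can be approximated with a short, dependency-free sketch. This is an illustrative stand-in and not the SPAC implementation: the name decode_phenotype_sketch is hypothetical, it takes an iterable of column names (for example data.columns) instead of the DataFrame itself, and it assumes the column name is built as prefix + marker + suffix. It always returns ‘+’ or ‘-’ as the value and does not reproduce the binary versus string type handling mentioned above.

```python
import re


def decode_phenotype_sketch(columns, phenotype_code, prefix="", suffix=""):
    """Hypothetical stand-in for illustration; not the SPAC source code."""
    if not phenotype_code.endswith(("+", "-")):
        raise ValueError(
            f"Phenotype code {phenotype_code!r} must end with '+' or '-'."
        )

    # A marker name is a run of non-sign characters followed by one sign.
    pairs = re.findall(r"([^+-]+)([+-])", phenotype_code)

    available = set(columns)
    decoded = {}
    for marker, sign in pairs:
        # Assumption: prefix and suffix wrap the marker name directly.
        column = f"{prefix}{marker}{suffix}"
        if column not in available:
            raise ValueError(f"Column {column!r} does not exist in the data.")
        decoded[column] = sign
    return decoded


# Column names as they might appear in a DataFrame (made-up example).
columns = ["CD4_expression", "CD8_expression"]
result = decode_phenotype_sketch(columns, "CD4+CD8-", suffix="_expression")
print(result)  # {'CD4_expression': '+', 'CD8_expression': '-'}
```

With the same columns, a code such as "CD4+CD8" (no trailing sign) or "CD3+" (no matching column) raises ValueError in this sketch, mirroring the two error conditions listed under Raises.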