Data.DrRead

Click here to view source code.

class DrRead:

It contains functions DrRead.PairCSV, DrRead.PairDef, DrRead.FeatCell, and DrRead.FeatDrug. They are respectively used for reading the response data from csv file, reading the response data integrated in the library, preparing the cell feature, and preparing the drug feature.

DrRead.PairCSV

def PairCSV(csv_path: str):

It can be used to read the response data from csv file. First you need to follow the tutorial to prepare the required response data, click here for details.

PARAMETERS:

  • csv_path (str) - The path of the response data. It is required to end in ".csv".

OUTPUTS:

  • pair_ls (list) - The cell-drug pairs. Each element in the list is a sub-list that contains three elements, which are the cell name, drug name, and drug response.

DrRead.PairDef

def PairDef(dataset: str,
            response: str):

It can be used to read the response data integrated in the library.

For dataset (str) and response (str), the following settings are available:

Parameter setting

Corresponding file

dataset=”CCLE”, response=”ActArea”

CCLE_ActArea.csv

dataset=”CCLE”, response=”IC50”

CCLE_IC50.csv

dataset=”GDSC1”, response=”AUC”

GDSC1_AUC.csv

dataset=”GDSC1”, response=”IC50”

GDSC1_IC50.csv

dataset=”GDSC2”, response=”AUC”

GDSC2_AUC.csv

dataset=”GDSC2”, response=”IC50”

GDSC2_IC50.csv

PARAMETERS:

  • dataset (str) - "CCLE", "GDSC1", or "GDSC2".

  • response (str) - "ActArea", "AUC", or "IC50".

OUTPUTS:

  • pair_ls (list) - The cell-drug pairs. Each element in the list is a sub-list that contains three elements, which are the cell name, drug name, and drug response.

DrRead.FeatCell

def FeatCell(csv_path: str,
             subset: bool,
             subset_path: str = None,
             save_feat_path: str = None,
             save_gene_path: str = None):

It can be used to prepare the cell feature. Each cell feature will be z-score standardized. First you need to follow the tutorials to prepare the required gene subset and cell data, click here for details.

Note

If there are nan values or missing genes in the cell data, the average of the non-nan values of the cell data will be filled in.

If you want to get genome-wide feature, use FeatCell(csv_path="feat.example.csv", subset=False).

If you want to use the default gene subset (containing 6,163 genes) to screen for cell feature, use FeatCell(csv_path="feat.example.csv", subset=True).

PARAMETERS:

  • csv_path (str) - The path of the cell data. It is required to end in ".csv".

  • subset (bool) - Whether to use the gene subset.

  • subset_path (str, optional) - The path of the gene subset. It is required to end in ".txt". If it is set to None, the default path will be used. (default: None)

  • save_feat_path (str, optional) - Save path for CellFeat. It is required to end in ".pkl". If it is set to None, the default path will be used. (default: None)

  • save_gene_path (str, optional) - Save path for GeneList. It is required to end in ".pkl". If it is set to None, the default path will be used. (default: None)

OUTPUTS:

  • CellFeat (dict) - The key is the cell name and the value is the z-score standardized cell feature.

  • GeneList (list) - Each element is a gene name, which corresponds to the cell feature.

DrRead.FeatDrug

def FeatDrug(csv_path: str,
             MPG_path: str,
             save_SMILES_path: str = None,
             save_MPG_path: str = None):

It can be used to prepare the drug feature. First you need to follow the tutorials to prepare the required response data, click here for details.

If you don’t need MPG feature, use FeatDrug(csv_path="pair.example.csv", MPG_path=None).

If you want to get MPG feature, use FeatDrug(csv_path="pair.example.csv", MPG_path="MolGNet.pt"). Click here to download MolGNet.pt.

PARAMETERS:

  • csv_path (str) - The path of the response data. It is required to end in ".csv".

  • MPG_path (str) - The path of the gene subset. It is required to end in ".pt". If it is set to None, MPG_dict will be None.

  • save_SMILES_path (str, optional) - Save path for SMILES_dict. It is required to end in ".pkl". If it is set to None, the default path will be used. (default: None)

  • save_MPG_path (str, optional) - Save path for MPG_dict. It is required to end in ".pkl". If it is set to None, the default path will be used. (default: None)

OUTPUTS:

  • SMILES_dict (dict) - The key is the drug name and the value is the SMILES string.

  • MPG_dict (dict) - The key is the drug name and the value is the MPG feature. The value is None if MPG_path is set to None.