Configuration
📜 Default config structure
{
  "global": {
    "output_dir": "output",
    "verbose": false,
    "save": true,
    "base_data_dir": "data"
  },
  "paths": {
    "drug_names_file": "drug_names.txt",
    "synergy_data_file": "synergy_data.csv",
    "node_dict_file": "node_dict.csv",
    "tissue_cline_file": "tissue_cline.csv",
    "db_file": null,
    "manual_chembl_csv": "manual_chembl.csv"
  },
  "columns": {
    "drug_name": "drug_name",
    "drug_name_A": "drug_name_A",
    "drug_name_B": "drug_name_B",
    "conc_A": "conc_A",
    "conc_B": "conc_B",
    "cell_line": "cell_line",
    "synergy": "synergy"
  },
  "options": {
    "synergy_threshold": 0.0,
    "double_drug_screen": true,
    "original_target_merge": "fill_missing"
  }
}
💡 Key points
db_file: null → Uses the internal package database (managed by the project, not the user).
base_data_dir: Base path; all relative paths are resolved against it.
columns: Customise column names if your data uses different headers.
Deep-merge override: A custom config merges recursively with the defaults; only the sections you specify override them.
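To make the deep-merge behaviour concrete, here is a minimal sketch of a recursive merge. The function name `deep_merge` and the sample override are hypothetical and only illustrate the idea; the package's own merge routine may differ in details.

```python
import copy

# Trimmed copy of the documented defaults, for illustration only.
DEFAULTS = {
    "global": {"output_dir": "output", "verbose": False, "save": True, "base_data_dir": "data"},
    "options": {"synergy_threshold": 0.0, "double_drug_screen": True},
}


def deep_merge(defaults, overrides):
    """Return a new dict with `overrides` merged recursively into `defaults`."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are sections: merge key by key instead of replacing.
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical user config that only touches two keys.
user_config = {"global": {"verbose": True}, "options": {"synergy_threshold": 0.1}}
config = deep_merge(DEFAULTS, user_config)
print(config["global"])
```

In this sketch the untouched keys (`output_dir`, `save`, `base_data_dir`, `double_drug_screen`) keep their default values, and `DEFAULTS` itself is not modified.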
📐 Data format specifications
Drug names file
Plain text, one drug per line:
Aspirin
Ibuprofen
Paracetamol
Node dictionary (CSV)
Gene/protein symbols mapped to logical model node names:
gene,node
EGFR,EGFR_node
TP53,p53
BRAF,BRAF_node
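As a quick way to inspect a node dictionary before running the pipeline, the sketch below loads this format into a plain Python dict using the standard `csv` module. The helper name `load_node_dict` is hypothetical, and the default column names `gene` and `node` are taken from the sample above.

```python
import csv
import io

# Sample content in the node dictionary format shown above.
NODE_DICT_CSV = """gene,node
EGFR,EGFR_node
TP53,p53
BRAF,BRAF_node
"""


def load_node_dict(handle):
    """Map each gene/protein symbol to its logical model node name."""
    reader = csv.DictReader(handle)
    return {row["gene"].strip(): row["node"].strip() for row in reader}


node_dict = load_node_dict(io.StringIO(NODE_DICT_CSV))
print(node_dict["TP53"])  # p53
```

For a real file, pass an open file handle instead, for example `open("data/node_dict.csv", newline="")`.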
Synergy data (CSV)
With concentration data (dual-drug screening):
drug_name_A,drug_name_B,conc_A,conc_B,tissue,cell_line,synergy
Aspirin,Ibuprofen,1.0,2.0,Breast,MCF7,0.15
Without concentration data (single-dose combinations):
drug_name_A,drug_name_B,tissue,cell_line,synergy
Aspirin,Ibuprofen,Breast,MCF7,0.15
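To check which of the two layouts a file uses, you can look at its header. The sketch below is a user-side helper, not part of the package: the function name `has_concentrations` is hypothetical, and how the pipeline itself tells the layouts apart is an assumption not covered here.

```python
import csv
import io


def has_concentrations(handle, conc_columns=("conc_A", "conc_B")):
    """Return True if the CSV header contains every concentration column."""
    header = next(csv.reader(handle), [])
    present = {name.strip() for name in header}
    return set(conc_columns).issubset(present)


# The two sample layouts shown above.
dual_dose = (
    "drug_name_A,drug_name_B,conc_A,conc_B,tissue,cell_line,synergy\n"
    "Aspirin,Ibuprofen,1.0,2.0,Breast,MCF7,0.15\n"
)
single_dose = (
    "drug_name_A,drug_name_B,tissue,cell_line,synergy\n"
    "Aspirin,Ibuprofen,Breast,MCF7,0.15\n"
)

print(has_concentrations(io.StringIO(dual_dose)))    # True
print(has_concentrations(io.StringIO(single_dose)))  # False
```

If your headers are renamed through the `columns` section, pass the renamed concentration columns as `conc_columns`.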
Tissue-cell line mapping (CSV)
tissue,cell_line
Breast,MCF7
Breast,T47D
Colorectal,HCT116
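The mapping is one row per cell line, so several rows can share a tissue. A minimal sketch of grouping the rows by tissue follows; the helper name `load_tissue_map` is hypothetical and the code only illustrates the file layout.

```python
import csv
import io
from collections import defaultdict

# Sample content in the tissue-cell line format shown above.
TISSUE_CSV = """tissue,cell_line
Breast,MCF7
Breast,T47D
Colorectal,HCT116
"""


def load_tissue_map(handle):
    """Group cell lines by tissue, preserving file order."""
    tissue_map = defaultdict(list)
    for row in csv.DictReader(handle):
        tissue_map[row["tissue"].strip()].append(row["cell_line"].strip())
    return dict(tissue_map)


tissue_map = load_tissue_map(io.StringIO(TISSUE_CSV))
print(tissue_map["Breast"])  # ['MCF7', 'T47D']
```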
🔧 Advanced features
Manual ChEMBL mapping
Provide a CSV to override ChEMBL resolution (skip network queries):
drug_name,ChEMBL_ID
Aspirin,CHEMBL25
Ibuprofen,CHEMBL521
Configure in paths.manual_chembl_csv.
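Before a run, it can help to see which drugs the manual mapping already covers and which would still need a lookup. The sketch below assumes exact, case-sensitive name matching, which may not be how the package matches names; `load_manual_chembl` and `split_by_mapping` are hypothetical helper names.

```python
import csv
import io

# Sample content in the manual mapping format shown above.
MANUAL_CSV = """drug_name,ChEMBL_ID
Aspirin,CHEMBL25
Ibuprofen,CHEMBL521
"""


def load_manual_chembl(handle):
    """Read the manual mapping into a {drug_name: ChEMBL_ID} dict."""
    return {
        row["drug_name"].strip(): row["ChEMBL_ID"].strip()
        for row in csv.DictReader(handle)
    }


def split_by_mapping(drug_names, manual_map):
    """Separate drugs covered by the manual mapping from the rest."""
    resolved = {name: manual_map[name] for name in drug_names if name in manual_map}
    unresolved = [name for name in drug_names if name not in manual_map]
    return resolved, unresolved


manual_map = load_manual_chembl(io.StringIO(MANUAL_CSV))
resolved, unresolved = split_by_mapping(["Aspirin", "Ibuprofen", "Paracetamol"], manual_map)
print(unresolved)  # ['Paracetamol']
```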
Verbose logging & timing
drexpa --config my_config.json --verbose
Output includes:
Step start/end timestamps
Per-step duration (seconds)
Preflight warnings & validation details
Pipeline summary
❓ Troubleshooting
Missing required files
Error: FileNotFoundError: Preflight validation failed. Missing required files
Fix: Check file paths in config. Ensure base_data_dir is correct.
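To narrow down which path is wrong, you can resolve each entry of `paths` against `base_data_dir` yourself. This is a sketch of such a check, not the package's preflight code: `find_missing_files` is a hypothetical name, and it checks every non-null entry because which files the preflight treats as required is not stated here.

```python
import tempfile
from pathlib import Path


def find_missing_files(config):
    """Return (key, resolved_path) pairs for configured files that do not exist."""
    base = Path(config["global"]["base_data_dir"])
    missing = []
    for key, value in config["paths"].items():
        if value is None:  # e.g. db_file: null, which uses the internal database
            continue
        path = Path(value)
        if not path.is_absolute():
            path = base / path  # relative paths are resolved against base_data_dir
        if not path.exists():
            missing.append((key, str(path)))
    return missing


# Demo in a temporary directory where only drug_names.txt exists.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "drug_names.txt").write_text("Aspirin\n")
    demo_config = {
        "global": {"base_data_dir": tmp},
        "paths": {
            "drug_names_file": "drug_names.txt",
            "synergy_data_file": "synergy_data.csv",
            "db_file": None,
        },
    }
    missing = find_missing_files(demo_config)

missing_keys = [key for key, _ in missing]
print(missing_keys)  # ['synergy_data_file']
```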
Missing required columns
Error: ValueError: Missing required columns in synergy data
Fix: Verify columns section in config matches your data headers. Use --verbose to see exact missing columns.
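A quick way to compare the `columns` section with a file's header is sketched below. The helper name `missing_columns`, the `required_keys` list, and the sample header `synergy_score` are all hypothetical; the set of columns the package actually requires may differ.

```python
import csv
import io


def missing_columns(handle, columns, required_keys):
    """Return the configured column names that are absent from the CSV header."""
    header = {name.strip() for name in next(csv.reader(handle), [])}
    return [columns[key] for key in required_keys if columns[key] not in header]


# Hypothetical data file whose synergy column is called "synergy_score".
data = "drug_name_A,drug_name_B,cell_line,synergy_score\nAspirin,Ibuprofen,MCF7,0.15\n"
required_keys = ["drug_name_A", "drug_name_B", "cell_line", "synergy"]

# With the default column names, the "synergy" header is not found.
default_columns = {
    "drug_name_A": "drug_name_A",
    "drug_name_B": "drug_name_B",
    "cell_line": "cell_line",
    "synergy": "synergy",
}
print(missing_columns(io.StringIO(data), default_columns, required_keys))  # ['synergy']

# Pointing the "synergy" key at the real header resolves the mismatch.
fixed_columns = dict(default_columns, synergy="synergy_score")
print(missing_columns(io.StringIO(data), fixed_columns, required_keys))  # []
```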
ChEMBL resolution fails
Error: Network timeout or no results for drug name
Fix:
Check drug name spelling (must match ChEMBL exactly or be unambiguous).
Provide a manual ChEMBL mapping in manual_chembl.csv to skip network queries.
Run with --verbose to see which drugs failed.