Reproduce Data & Simulation Results
ROC and PR curves, Fitness Evolution [2]
- Install the druglogics-synergy module and use the version 1.2.1: git checkout v1.2.1 (or use directly the released v1.2.1 package)
- Run the script run_druglogics_synergy.sh in the above repo.
You can of course change several other parameters in the input files or the script itself (e.g. number of simulations to run - see here for a complete list of configuration options).
To get the results for the topology mutations for CASCADE 2.0 you need to change two options in the ags_cascade_2.0/config file: topology_mutations: 10 and balance_mutations: 0 (the default options are \(0\) topology mutations and \(3\) link-operator/balance mutations).
If you wish to get the results using both kinds of mutation, set both the topology_mutations and balance_mutations options to a non-zero value (\(10\) and \(3\) were used in the simulations).
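The two option changes described above can also be applied programmatically. The following is a minimal sketch, not part of the druglogics-synergy tooling: the helper name set_config_options is hypothetical, and it assumes that each option sits on its own line in the form "name:<whitespace>value".

```python
import re


def set_config_options(config_text, updates):
    """Return config_text with the `name: value` lines named in `updates` rewritten.

    Assumption: one option per line, written as `name:<whitespace>value`.
    Lines that do not match an option in `updates` are kept unchanged.
    """
    seen = set()
    out = []
    for line in config_text.splitlines():
        match = re.match(r"^(\w+):(\s*)(.*)$", line)
        if match and match.group(1) in updates:
            key, sep = match.group(1), match.group(2)
            out.append(f"{key}:{sep}{updates[key]}")
            seen.add(key)
        else:
            out.append(line)
    missing = set(updates) - seen
    if missing:
        raise KeyError(f"options not found in config: {sorted(missing)}")
    return "\n".join(out) + "\n"


# Hypothetical excerpt holding the default values quoted in the text.
default_excerpt = "topology_mutations:\t0\nbalance_mutations:\t3\n"

# Topology mutations only.
topology_only = set_config_options(
    default_excerpt, {"topology_mutations": 10, "balance_mutations": 0}
)
# Both kinds of mutation.
both = set_config_options(
    default_excerpt, {"topology_mutations": 10, "balance_mutations": 3}
)
print(topology_only, end="")
```

In practice the text would be read from and written back to ags_cascade_2.0/config; editing the file by hand is equally valid.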
So, for example, to get the simulation output directories for the Cascade 1.0 Analysis I just run the run_druglogics_synergy.sh script with the following options defined in the loops inside (no need to change any further configuration):
- cascade_version: 1.0 (which topology to use)
- train: ss rand (train to the AGS steady state or to a (random) proliferation phenotype)
- sim_num: 50 (number of simulations)
- attr_tool: fixpoints (attractor tool, common across all report)
- synergy_method: hsa bliss (synergy calculation method used by drabme)
Each subsequent druglogics-synergy execution results in an output directory and the files of interest (which are used to produce the ROC and PR curves in this report and the AUC sensitivity figures) are the modelwise_synergies.tab and the ensemble_synergies.tab respectively.
For the fitness evolution figures we used the summary.txt file of the corresponding simulations.
Specifically, the results described above are stored in the compressed Zenodo file sim_res.tar.gz.
When uncompressed, the sim_res.tar.gz file yields 2 separate directories, one per topology (CASCADE 1.0 and CASCADE 2.0).
The directory with the CASCADE 2.0 related results has 3 subdirectories, corresponding to the different parameterizations used in the simulations (link mutations, topology mutations or both).
Each directory below these specifies in its name the training type, number of simulations, attractor tool and synergy assessment method.
Fitness vs Performance Methods
Generate the training data samples
Use the gen_training_data.R script to produce the training data samples. In this script we first choose \(11\) numbers that represent the number of nodes that are going to be flipped in the AGS steady state. These numbers range from \(1\) (flip just one node) to \(24\) (flip all nodes, i.e. create a completely reversed steady state). Then, for each such number, we generate \(20\) new partially correct steady states, each one having the same number of randomly chosen flips (e.g. \(20\) steady states where randomly chosen sets of \(3\) nodes have been flipped). The two extreme cases are exceptions: the single flip is applied to every node in turn (\(24\) different steady states), and flipping all the nodes generates the unique completely reversed steady state. Thus, in total, \(205\) training data sample files are produced (\(205 = 9 \times 20 + 1 \times 24 + 1 \times 1\)).
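The counting argument can be checked with a short sketch. This is not the gen_training_data.R script itself: the function name, the node names, the steady state values and the intermediate flip counts are all hypothetical, since the text only fixes the endpoints (\(1\) and \(24\)), the example value \(3\) and the total of \(11\) flip counts.

```python
import random


def flipped_states(steady_state, flip_counts, samples_per_count=20, seed=42):
    """Return (k, state) pairs where `state` differs from `steady_state` in k nodes.

    k == 1 enumerates every single-node flip, k == len(steady_state) yields the
    one fully reversed state, and any other k draws `samples_per_count` random
    node subsets (duplicate subsets are possible and not filtered out here).
    """
    rng = random.Random(seed)
    nodes = sorted(steady_state)
    results = []
    for k in flip_counts:
        if k == 1:
            subsets = [(node,) for node in nodes]
        elif k == len(nodes):
            subsets = [tuple(nodes)]
        else:
            subsets = [tuple(rng.sample(nodes, k)) for _ in range(samples_per_count)]
        for subset in subsets:
            state = dict(steady_state)
            for node in subset:
                state[node] = 1 - state[node]  # flip 0 <-> 1
            results.append((k, state))
    return results


# Hypothetical 24-node steady state (placeholder names and values).
steady_state = {f"node_{i:02d}": i % 2 for i in range(24)}
# Illustrative choice of 11 flip counts between 1 and 24.
flip_counts = [1, 2, 3, 4, 6, 8, 10, 12, 16, 20, 24]

samples = flipped_states(steady_state, flip_counts)
print(len(samples))  # 24 + 9 * 20 + 1 = 205
```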
The training data files are stored in the Zenodo file training-data-files.tar.gz.
Run model ensembles simulations
To generate the calibrated model ensembles and perform the drug response analysis on them we use the script run_druglogics_synergy_training.sh from the druglogics-synergy repository root (version 1.2.1: git checkout v1.2.1).
Note that the training-data-files directory must be placed inside the druglogics-synergy root directory before executing the aforementioned script.
The end result is one directory of simulation results per training data file.
The following changes need to be applied to the CASCADE 1.0 or 2.0 configuration file (depending on the topology you are using, the file is either druglogics-synergy/ags_cascade_1.0/config or druglogics-synergy/ags_cascade_2.0/config) before executing the script (some are done automatically in the script):
- If topology mutations are used, disable the link-operator mutations (balance_mutations: 0) and use topology_mutations: 10.
- Change the number of simulations to \(20\) (link-operator mutations) or \(50\) (topology mutations) for CASCADE 2.0 and to \(50\) for CASCADE 1.0 (default value, link-operator mutations).
- Change to the Bliss synergy method (synergy_method: bliss) no matter the mutations used or topology.
The results of the CASCADE 2.0 link-operator mutated model simulations are stored in the Zenodo file fit-vs-performance-results-bliss.tar.gz, whereas for the CASCADE 2.0 topology mutated models, in the fit-vs-performance-results-bliss-topo.tar.gz file.
The results of the CASCADE 1.0 link-operator mutated model simulations are stored in the Zenodo file fit-vs-performance-results-bliss-cascade1.tar.gz.
To parse and tidy up the data from the simulations, use the scripts fit_vs_perf_cascade2_lo.R (for the link-operator-based CASCADE 2.0 simulations), fit_vs_perf_cascade2_topo.R (for the topology-mutation-based CASCADE 2.0 simulations) and fit_vs_perf_cascade1_lo.R (for the link-operator-based CASCADE 1.0 simulations).
Also, we used the run_druglogics_synergy.sh script at the root of the druglogics-synergy repo (script configuration for CASCADE 2.0: {2.0, prolif, 150, fixpoints, bliss} and for CASCADE 1.0: {1.0, prolif, 50, fixpoints, bliss}) to get the ensemble results of the random (proliferative) models that we will use to normalize the calibrated model performance.
The result of this simulation is also part of the results described above (see section above, also considering the necessary changes applied for the topology mutation-based simulations for CASCADE 2.0) and it is available inside the file sim_res.tar.gz of the Zenodo dataset (also available in the results directory - see Repo results structure).
Random Model Bootstrap
- Install the druglogics-synergy module and use the version 1.2.1: git checkout v1.2.1 (or use directly the released v1.2.1 package)
- Run the script run_gitsbe_random.sh inside the ags_cascade_2.0 directory of the druglogics-synergy repository. This creates a results directory which includes a models directory, with a total of \(3000\) gitsbe models which we are going to use for the bootstrapping.
- Place the models directory inside the ags_cascade_2.0 directory.
- Execute the bootstrap_models_drabme.sh inside the druglogics-synergy/ags_cascade_2.0 directory. Change the config file appropriately to have synergy_method: bliss. The bootstrap configuration consists of \(20\) batches, each one consisting of a sample of \(100\) randomly selected models from the model directory pool.
- Use the script random_model_boot.R to tidy the data from the simulations.
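The batch construction in the bootstrap step can be sketched as follows. This is an illustration only and not the bootstrap_models_drabme.sh script: the function name and the model file names are hypothetical, and the sketch assumes that models are drawn without replacement within a batch while different batches may share models.

```python
import random


def bootstrap_batches(model_files, batches=20, batch_size=100, seed=0):
    """Draw `batches` random samples of `batch_size` models from the pool.

    Assumption: no model is repeated inside a batch, but the same model
    may appear in several batches.
    """
    rng = random.Random(seed)
    return [rng.sample(model_files, batch_size) for _ in range(batches)]


# Hypothetical pool of 3000 model file names.
pool = [f"model_{i}.gitsbe" for i in range(3000)]

batch_list = bootstrap_batches(pool)
print(len(batch_list), len(batch_list[0]))  # 20 100
```

Each batch would then be handed to drabme as one model ensemble, giving \(20\) ensemble predictions from which the spread of the AUC values can be estimated.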
The results of the simulations are stored in the random_model_bootstrap.tar.gz file of the Zenodo dataset.
Parameterization Bootstrap
- Install the druglogics-synergy module and use the version 1.2.1: git checkout v1.2.1 (or use directly the released v1.2.1 package)
- To generate the \(3\) pools of calibrated models (fitting to the AGS steady state) subject to different normalization schemes, run the script run_gitsbe_param.sh inside the ags_cascade_2.0 directory of the druglogics-synergy repository root. This will generate the directories gitsbe_link_only_cascade_2.0_ss, gitsbe_topology_only_cascade_2.0_ss and gitsbe_topo_and_link_cascade_2.0_ss, each of which has a models directory (the model pool).
- Repeat for each different pool (models directory):
  - Place the models directory inside the ags_cascade_2.0 directory of the druglogics-synergy repository root.
  - Use the bootstrap_models_drabme.sh script, while changing the following configuration: batches=25, batch_size=300 and the project variable name (input to eu.druglogics.drabme.Launcher) to the one of the following three that matches the models pool:
    - --project=link_only_cascade_2.0_ss_bliss_batch_${batch}
    - --project=topology_only_cascade_2.0_ss_bliss_batch_${batch}
    - --project=topo_and_link_cascade_2.0_ss_bliss_batch_${batch}
    Also change the config file appropriately to have synergy_method: bliss.
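To make the pairing between model pool and project name explicit, the sketch below expands the ${batch} placeholder for each of the three pools. It is only an illustration: the helper name is hypothetical, and numbering the batches from \(1\) is an assumption that should be checked against the loop in bootstrap_models_drabme.sh.

```python
# Pool directory (as produced by run_gitsbe_param.sh) -> project name prefix.
POOL_PROJECT_PREFIX = {
    "gitsbe_link_only_cascade_2.0_ss": "link_only_cascade_2.0_ss_bliss",
    "gitsbe_topology_only_cascade_2.0_ss": "topology_only_cascade_2.0_ss_bliss",
    "gitsbe_topo_and_link_cascade_2.0_ss": "topo_and_link_cascade_2.0_ss_bliss",
}


def project_args(pool_dir, batches=25, first_batch=1):
    """List the --project arguments for one pool, one per bootstrap batch.

    Assumption: batches are numbered consecutively starting at `first_batch`.
    """
    prefix = POOL_PROJECT_PREFIX[pool_dir]
    return [
        f"--project={prefix}_batch_{batch}"
        for batch in range(first_batch, first_batch + batches)
    ]


for pool_dir in POOL_PROJECT_PREFIX:
    args = project_args(pool_dir)
    print(pool_dir, args[0], args[-1])
```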
The results of all these simulations are stored in the parameterization-comp.tar.gz Zenodo file.
Use the script get_param_comp_boot_data.R to tidy up the simulation data to a nice table format.
When uncompressed, the parameterization-comp.tar.gz file yields 3 separate directories, one per parameterization scheme.
Each separate directory is structured so as to contain the gitsbe simulation results with the model pool inside (result of the script run_gitsbe_param.sh), a boot_res directory (includes the results of the bootstrap_models_drabme.sh script) and lastly the results of the random proliferative model simulations, which can be reproduced following the guidelines above.
ERK investigation
We split the link operator model pool (\(4500\) models, see above) into \(2\) pools, one with a total of \(2764\) models that have ERK_f active and one with a total of \(1736\) models that have it inhibited in the corresponding stable states.
The two model pools are the two directories named erk_active_pool and erk_inhibited_pool respectively inside the Zenodo file erk_perf_investigation.tar.gz.
Then:
- Install the druglogics-synergy module and use the version 1.2.1: git checkout v1.2.1 (or use directly the released v1.2.1 package)
- Run the script bootstrap_models_drabme_erk_pools.sh inside the ags_cascade_2.0 directory of the druglogics-synergy repository. This will produce the drug combination prediction results for the bootstrapped ensembles of boolean models from each pool. From each pool, we bootstrapped \(35\) ensembles with \(300\) models each and used the bliss drabme synergy method to calculate the prediction results.
- Run the script erk_perf_tidy_data.R to calculate the ROC and PR AUC of every bootstrapped ensemble, subject to normalization against the random proliferative model predictions.
CASCADE 1.0 Calibrated Models bootstrap
- Install the druglogics-synergy module and use the version 1.2.1: git checkout v1.2.1 (or use directly the released v1.2.1 package)
- Generate one large pool of calibrated models (fitting to the AGS steady state) by using the instructions above: use the run_druglogics_synergy.sh script at the root of the druglogics-synergy repo with script config: {1.0, ss, 1000, fixpoints, bliss}
- Use the bootstrap_models_drabme_cascade1.sh script to run the bootstrapped model simulations.
- Use the get_syn_res_boot_ss_cascade1.R script to tidy up the bootstrap simulation data.
The results from the bootstrap simulations are stored in the ss_cascade1_model_bootstrap.tar.gz file of the Zenodo dataset.
Repo results structure
We have gathered all the necessary output files from the above simulations (mostly relating to ROC, PR curves and AUC sensitivity figures) in the results directory for ease of use in our report.
The results directory has 3 main sub-directories:
- link-only: results from the link-operator mutated models only (used in the sections Cascade 1.0 Analysis and CASCADE 2.0 Analysis (Link Operator Mutations))
- topology-only: results from the topology-mutated models only (used in the section CASCADE 2.0 Analysis (Topology Mutations))
- topo-and-link: results where both mutations were applied to the generated boolean models (used in section CASCADE 2.0 Analysis (Topology and Link Operator Mutations))
In addition, there is a data directory that includes the following:
- observed_synergies_cascade_1.0: the gold-standard synergies for the CASCADE 1.0 topology (Flobak et al. 2015)
- observed_synergies_cascade_2.0: the gold-standard synergies for the CASCADE 2.0 topology (Flobak et al. 2019)
- steadystate, steadystate.rds: the AGS training data for the calibrated models (file + compressed data) - see lo_mutated_models_heatmaps.R script.
- edge_mat.rds, topo_ss_df.rds: heatmap data for the topology-mutation models - see lo_mutated_models_heatmaps.R script.
- lo_df.rds, lo_ss_df.rds: heatmap data for the link-operator models - see topo_mutated_models_heatmaps.R script.
- node_pathway_annotations_cascade2.csv, node_path_tbl.rds: node pathway annotation data for CASCADE 2.0 and compressed data table produced via the node_path_annot_cascade2.R script.
- cosmic_cancer_gene_census_all_29102020.tsv: Cancer Gene Census COSMIC data downloaded from https://cancer.sanger.ac.uk/census (for academic purposes)
- cosmic_tbl.rds: a compressed file with a tibble object having the CASCADE 2.0 nodes and their respective COSMIC cancer role annotation (see get_cosmic_data_annot.R script).
- bootstrap_rand_res.rds: a compressed file with a tibble object having the result data in a tidy format for the analysis related to the Bootstrap Random Model AUC section.
- res_fit_aucs_cascade1.rds: a compressed file with a tibble object having the result data in a tidy format for the analysis related to the Fitness vs Ensemble Performance section (CASCADE 1.0, link operator mutations).
- res_fit_aucs.rds: a compressed file with a tibble object having the result data in a tidy format for the analysis related to the Fitness vs Ensemble Performance section (CASCADE 2.0, link operator mutations).
- res_fit_aucs_topo.rds: a compressed file with a tibble object having the result data in a tidy format for the analysis related to the Fitness vs Ensemble Performance section (CASCADE 2.0, topology mutations).
- res_param_boot_aucs.rds: a compressed file with a tibble object having the result data in a tidy format for the analysis related to the Bootstrap Simulations section.
- boot_cascade1_res.rds: a compressed file with a tibble object having the result data from executing the script get_syn_res_boot_ss_cascade1.R, related to the scrambled topologies investigation in CASCADE 1.0.
- scrambled_topo_res_cascade1.rds: a compressed file with a tibble object having the result data from executing the script get_syn_res_scrambled_topo_cascade1.R, related to the scrambled topologies investigation in CASCADE 1.0.
- scrambled_topo_res_cascade2.rds: a compressed file with a tibble object having the result data from executing the script get_syn_res_scrambled_topo_cascade2.R, related to the scrambled topologies investigation in CASCADE 2.0.
- res_erl.rds: a compressed file with a tibble object having the result data from executing the script erk_perf_tidy_data.R, related to the ERK analysis with the link operator mutated models in CASCADE 2.0.
- tumor_vol_data.csv: the data from the xenograft experiments relating to the PI and 5Z inhibitors.
[2] The AUC sensitivity plots across the report are also included.