Inputs ====== PyDrugLogics requires various inputs to train Boolean Models and predict drug synergies. These inputs define the interaction model, training data, model outputs, perturbations, and observed synergy scores, each playing an essential role in the software pipeline. To see the full Jupyter Notebook tutorial click `here `_. Below is a detailed guide to each input type and its structure. .. _input_files: Input Files ----------- PyDrugLogics supports the following input files to load or construct Boolean Models: Boolean Model from .sif File ~~~~~~~~~~ The `.sif` file format represents network interactions using activation and inhibition relationships. Each row defines an interaction between nodes in the network. **Key notations:** - `->`: Activation - `-|`: Inhibition **Example:** .. code-block:: text :class: copybutton A -> B C -> A B -| C **Code Example for Loading Interactions for construction a Boolean Model** .. code-block:: python :class: copybutton # Initialize InteractonModel from pydruglogics.model.InteractionModel import InteractionModel interaction_model = InteractionModel(interactions_file='./path/to/network.sif', model_name='', remove_self_regulated_interactions=False, remove_inputs=False, remove_outputs=False) # Initialize BooleanModel from pydruglogics.model.BooleanModel import BooleanModel boolean_model = BooleanModel(model=model, model_name='test', mutation_type='balanced', attractor_tool='mpbn', attractor_type='trapspaces') .. note:: A `.sif` file defines one Boolean Model. Boolean Model from .bnet File ~~~~~~~~~~~ The `.bnet` format is used for defining a Boolean network, where nodes represent variables, and their activation expressions define relationships and dependencies among them. Each node's state is determined by logical expressions. Logical operators are used to specify relationships: - `&`: Conjunction - `|`: Disjunction - `!`: Negation **Example:** .. code-block:: text :class: copybutton A, (B) & !(C) B, ((A) | C) C, (B) **Code Example for Loading Model from .bnet** .. code-block:: python :class: copybutton from pydruglogics.model.BooleanModel import BooleanModel model = BooleanModel(file='./path/to/network.bnet', model_name='test', mutation_type='balanced', attractor_tool='mpbn', attractor_type='stable_states') .. note:: A `.bnet` file defines one Boolean Model. .. _training_data: Training Data ------------- The training data file contains condition-response pairs, and a weight that are essential for evaluating the performance of Boolean models during the genetic algorithm's evolutionary process. A fitness score is calculated for the condition(s)-response(s), reflecting how well (0-worst, 1-best) the model aligns with the training data. Format ~~~~~~~ Each observation consists of: 1. **Condition**: - 2. **Response**: Specifies the node activity levels as a tab-separated list. 3. **Weight**: Once the fitness values have been calculated, this value is used to weight each condition-response pair and calculate the overall weighted average fitness score of the model fitted to the training data Types of Training Data ~~~~~~~~~~~~~~~~~~~~~~ 1. **Unperturbed Condition - Steady State Response** This training type describes the system's steady state, where activity values are assigned to nodes in the range [0, 1]. These values represent the observed state of the system and are compared against the model's attractors to calculate fitness. Example: .. code-block:: text :class: copybutton Condition - Response A:0 B:1 C:0 D:0.453 Weight:1 2. **Unperturbed Condition - Global Output Response** This training type specifies the system's behavior under no perturbation, typically used for studying proliferation in the networks. The response is defined as `globaloutput:` in the range [0, 1], with fitness calculated based on how close the predicted global output is to the observed value. Example: .. code-block:: text :class: copybutton Condition - Response globaloutput:1 Weight:1 Initialization Options ~~~~~~~~~~~~~~~~~~~~~~ **1. Load Training Data from File** This method allows loading training data directly from a file. The file can be in a format such as `training_data.tab` or `training_data`, containing input in a format like this: .. code-block:: text # training data Condition - Response A:0 B:0 C:0 Weight:1 Where the responses are tab-separated. Example: .. code-block:: python :class: copybutton from pydruglogics.input.TrainingData import TrainingData training_data = TrainingData(input_file='./path/to/training') **2. Direct Initialization** This method initializes the training data using Python data structures. The responses and weights are provided as a list of tuples. Example: .. code-block:: python :class: copybutton from pydruglogics.input.TrainingData import TrainingData observations = [(["A:1", "B:0", "C:0.5"], 1)] training_data = TrainingData(observations=observations) .. _model_outputs: Model Outputs ------------- The `model outputs` defines network nodes and their integer weights, determining their contribution to global signaling outputs (e.g., cell proliferation or death). Format ~~~~~~~ Each model output contains: 1. **Node name**: string value. 2. **Weight** (positive for proliferation, negative for death): continuous numeric value. Initialization Options ~~~~~~~~~~~~~~~~~~~~~~ **1. Load Model Outputs from File** This method allows loading model outputs directly from a file. The file can be in a format such as `modeloutputs.tab` or `modeloutputs`, containing input in a format like this: .. code-block:: text :class: copybutton # Name Weight A 1.0 B -1.0 C -2.0 Where the names and weights are tab-separated. Example: .. code-block:: python :class: copybutton from pydruglogics.input.ModelOutputs import ModelOutputs model_outputs = ModelOutputs(input_file='./path/to/modeloutputs') **2. Direct Initialization** This method initializes the model outputs using Python data structures. Outputs are provided as a dictionary, with keys representing node names and values representing their corresponding output values. Example: .. code-block:: python :class: copybutton from pydruglogics.input.ModelOutputs import ModelOutputs model_outputs_dict = { "A": 1.0, "B": -1.0, "C": -2.0 } model_outputs = ModelOutputs(input_dictionary=model_outputs_dict) .. _perturbations: Perturbations ------------- Perturbations combine drugs applied to the system. The perturbations list contains all drug combinations to be tested. The drug data contains the effect of each drug on the nodes. .. note:: Only 1- and 2-drug perturbations are allowed. Perturbations with more than two drugs are not supported. Initialization From Dictionary ~~~~~~~~~~~~~~~~~~~~~~ You can define both `drug_data` and `perturbation_data`, or just `drug_data`: 1. **Define `drug_data` and `perturbation_data`:** Provide a list of drugs, where each drug entry specifies: a. **Drug data:** - **Drug name**: Unique name of the drug. - **Target(s)**: The node(s) in the network affected by the drug. - **Effect**: This specifies how the drug influences the target and can take the following values: - `activates`: The drug increases the target's activity. - `inhibits`: The drug decreases the target's activity (this is the default if no effect is specified). b. **Perturbation data:** - **Perturbations**: One or two-drug combinations. The pipeline handles only single and tro-drug combinations. If both `drug_data` and `perturbation_data` are defined, the explicitly provided perturbations will be used. Example: .. code-block:: python :class: copybutton # Define drug_data drug_data = [ ['PI', 'A', 'inhibits'], # PI inhibits target A ['PD', 'B', 'activates'], # PD activates target B ['CT', 'C, D, E'], # CT inhibits targets C, D, and E ['BI', 'F, G'], # BI inhibits targets F and G ['PK', 'H'], # PK inhibits target H ['AK', 'I'] # Ak inhibits target I ] # Define perturbation_data perturbation_data = [ ['PI'], ['PD'], ['CT'], ['BI'], ['PK'], ['AK'], ['PI', 'PD'], ['PD', 'PK'], ['CT', 'AK'], ['BI', 'PK'], ['BI', 'AK'] perturbations = Perturbation(drug_data=drug_data, perturbation_data=perturbation_data) 2. **Define only `drug_data`:** If no `perturbation_data` is provided, the pipeline will automatically generate all possible two-drug combinations from the `drug_data`. Example: .. code-block:: python :class: copybutton drug_data = [ ['PI', 'A', 'inhibits'], # PI inhibits target A ['PD', 'B', 'activates'], # PD activates target B ['CT', 'C, D, E'], # CT inhibits targets C, D, and E ['BI', 'F, G'], # BI inhibits targets F and G ['PK', 'H'], # PK inhibits target H ['AK', 'I'] # Ak inhibits target I ] perturbations = Perturbation(drug_data=drug_data) .. note:: - If `perturbation_data` is not provided, it will be automatically calculated to include all drug combinations from the `drug_data`. - The `effect` field in `drug_data` is optional. If omitted, the pipeline assumes the effect is `inhibits`. - Multiple targets can be specified for a single drug by listing them in the `Target(s)` field, separated by commas. - Valid options for the `effect` field are: `activates` and `inhibits`. .. _observed_synergy_scores: Observed Synergy Scores ------------------------ Observed synergy scores are ground truth data used to evaluate model predictions. They are typically derived from experimental datasets or literature sources. Example: .. code-block:: python :class: copybutton observed_synergy_scores = ["PI-PD", "PD-AK"]