Inputs

PyDrugLogics requires various inputs to train Boolean Models and predict drug synergies. These inputs define the interaction model, training data, model outputs, perturbations, and observed synergy scores, each playing an essential role in the software pipeline.

To see the full Jupyter Notebook tutorial click here.

Below is a detailed guide to each input type and its structure.

Input Files

PyDrugLogics supports the following input files to load or construct Boolean Models:

Boolean Model from .sif File

The .sif file format represents network interactions using activation and inhibition relationships. Each row defines an interaction between nodes in the network.

Key notations:

  • ->: Activation

  • -|: Inhibition

Example:

A -> B
C -> A
B -| C
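
To make the notation concrete, here is a minimal sketch, independent of PyDrugLogics, that reads lines in this format into (source, effect, target) tuples. The helper name parse_sif_line is hypothetical and says nothing about how InteractionModel reads the file internally.

```python
# Hypothetical helper for illustration; not part of the PyDrugLogics API.
def parse_sif_line(line):
    """Split 'A -> B' or 'B -| C' into (source, effect, target)."""
    source, arrow, target = line.split()
    effects = {"->": "activation", "-|": "inhibition"}
    if arrow not in effects:
        raise ValueError(f"Unknown interaction symbol: {arrow!r}")
    return source, effects[arrow], target


sif_text = """A -> B
C -> A
B -| C"""

# Skip blank lines and parse every remaining row.
interactions = [parse_sif_line(line) for line in sif_text.splitlines() if line.strip()]
for interaction in interactions:
    print(interaction)
```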

Code Example for Loading Interactions to Construct a Boolean Model

# Initialize InteractionModel
from pydruglogics.model.InteractionModel import InteractionModel

interaction_model = InteractionModel(interactions_file='./path/to/network.sif', model_name='',
                                     remove_self_regulated_interactions=False, remove_inputs=False,
                                     remove_outputs=False)

# Initialize BooleanModel
from pydruglogics.model.BooleanModel import BooleanModel

boolean_model = BooleanModel(model=interaction_model, model_name='test', mutation_type='balanced',
                             attractor_tool='mpbn', attractor_type='trapspaces')

Note

A .sif file defines one Boolean Model.

Boolean Model from .bnet File

The .bnet format is used for defining a Boolean network, where nodes represent variables, and their activation expressions define relationships and dependencies among them. Each node’s state is determined by logical expressions. Logical operators are used to specify relationships:

  • &: Conjunction

  • |: Disjunction

  • !: Negation

Example:

A, (B) & !(C)
B, ((A) | C)
C, (B)
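
As an illustration of what these expressions mean, the sketch below parses the example network and finds its stable states by brute force, that is, the states every rule leaves unchanged. It is a plain-Python teaching aid with hypothetical helper names (parse_bnet, stable_states). It is not how PyDrugLogics or mpbn compute attractors, and exhaustive enumeration is only practical for very small networks.

```python
from itertools import product

bnet_text = """A, (B) & !(C)
B, ((A) | C)
C, (B)"""


def parse_bnet(text):
    """Return {node: python_expression} for each 'node, expression' line."""
    rules = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        node, expression = line.split(",", 1)
        # Translate the .bnet operators into Python's boolean keywords.
        expression = (expression.replace("!", " not ")
                                .replace("&", " and ")
                                .replace("|", " or "))
        rules[node.strip()] = expression.strip()
    return rules


def stable_states(rules):
    """Enumerate all states and keep those that every rule leaves unchanged."""
    nodes = sorted(rules)
    found = []
    for values in product([False, True], repeat=len(nodes)):
        state = dict(zip(nodes, values))
        # eval is acceptable here only because the input is a trusted literal.
        updated = {n: bool(eval(rules[n], {"__builtins__": {}}, state)) for n in nodes}
        if updated == state:
            found.append({n: int(v) for n, v in state.items()})
    return found


states = stable_states(parse_bnet(bnet_text))
print(states)
```

For the three-node example this yields two stable states: all nodes off, and A off with B and C on.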

Code Example for Loading Model from .bnet

from pydruglogics.model.BooleanModel import BooleanModel

model = BooleanModel(file='./path/to/network.bnet', model_name='test', mutation_type='balanced',
                     attractor_tool='mpbn', attractor_type='stable_states')

Note

A .bnet file defines one Boolean Model.

Training Data

The training data file contains condition-response pairs, each with a weight, which are used to evaluate Boolean models during the genetic algorithm’s evolutionary process. A fitness score is calculated for each condition-response pair, reflecting how well the model aligns with the training data (0 is worst, 1 is best).

Format

Each observation consists of:

  1. Condition: The state of the system under which the response was observed. A dash (-) denotes the unperturbed condition.

  2. Response: Specifies the node activity levels as a tab-separated list.

  3. Weight: Once the fitness values have been calculated, this value is used to weight each condition-response pair when calculating the overall weighted average fitness score of the model fitted to the training data.
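
The role of the weight can be sketched as an ordinary weighted average. The function below is a hypothetical helper, not the PyDrugLogics implementation, and assumes the per-observation fitness values have already been computed.

```python
def weighted_average_fitness(scored_observations):
    """scored_observations: list of (fitness, weight) pairs, fitness in [0, 1]."""
    total_weight = sum(weight for _, weight in scored_observations)
    if total_weight == 0:
        raise ValueError("At least one observation needs a non-zero weight.")
    weighted_sum = sum(fitness * weight for fitness, weight in scored_observations)
    return weighted_sum / total_weight


# Two made-up observations: fitness 0.8 with weight 1, fitness 0.5 with weight 3.
overall = weighted_average_fitness([(0.8, 1), (0.5, 3)])
print(overall)
```

The second observation carries three times the weight of the first, so the result lies closer to 0.5 than to 0.8.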

Types of Training Data

  1. Unperturbed Condition - Steady State Response

This training type describes the system’s steady state, where activity values are assigned to nodes in the range [0, 1]. These values represent the observed state of the system and are compared against the model’s attractors to calculate fitness.

Example:

Condition
-
Response
A:0 B:1 C:0 D:0.453
Weight:1

  2. Unperturbed Condition - Global Output Response

This training type specifies the system’s behavior under no perturbation, typically used for studying proliferation in the networks. The response is defined as globaloutput:<value> in the range [0, 1], with fitness calculated based on how close the predicted global output is to the observed value.

Example:

Condition
-
Response
globaloutput:1
Weight:1
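
The text above does not give the exact fitness formula. As a rough illustration of "how close" a model is to an observation, the sketch below scores a steady state response as one minus the mean absolute difference between observed and predicted node values. Both this scoring rule and the helper names are assumptions made for illustration and may differ from what PyDrugLogics computes.

```python
def parse_response(response):
    """Turn 'A:0 B:1 D:0.453' (space- or tab-separated) into {'A': 0.0, ...}."""
    pairs = (item.split(":") for item in response.split())
    return {node: float(value) for node, value in pairs}


def steady_state_match(observed, attractor):
    """Assumed scoring: 1 minus the mean absolute difference over observed nodes."""
    differences = [abs(value - attractor[node]) for node, value in observed.items()]
    return 1 - sum(differences) / len(differences)


observed = parse_response("A:0\tB:1\tC:0\tD:0.453")
attractor = {"A": 0, "B": 1, "C": 0, "D": 1}  # hypothetical stable state of a model
score = steady_state_match(observed, attractor)
print(round(score, 3))
```

Under the same assumption, a global output response would be scored as one minus the absolute difference between the observed and predicted globaloutput values.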

Initialization Options

1. Load Training Data from File

This method loads training data directly from a file. The file can be named, for example, training_data.tab or training_data, and its content looks like this:

# training data
Condition
-
Response
A:0	B:0	C:0
Weight:1

The responses are tab-separated.

Example:

from pydruglogics.input.TrainingData import TrainingData

training_data = TrainingData(input_file='./path/to/training')

2. Direct Initialization

This method initializes the training data using Python data structures. The responses and weights are provided as a list of tuples.

Example:

from pydruglogics.input.TrainingData import TrainingData

observations = [(["A:1", "B:0", "C:0.5"], 1)]
training_data = TrainingData(observations=observations)
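
The file format and the list-of-tuples form carry the same information. The sketch below converts one into the other using only the standard library. parse_training_text is a hypothetical helper rather than a PyDrugLogics function, and it ignores the Condition value because only the unperturbed condition (-) is described here.

```python
def parse_training_text(text):
    """Parse Condition/Response/Weight blocks into [(responses, weight), ...]."""
    observations = []
    responses = None
    lines = [ln.strip() for ln in text.splitlines()]
    # Drop blank lines and comment lines such as '# training data'.
    lines = [ln for ln in lines if ln and not ln.startswith("#")]
    for index, line in enumerate(lines):
        if line == "Response":
            # The line after 'Response' holds the tab-separated node:value items.
            responses = lines[index + 1].split()
        elif line.startswith("Weight:"):
            weight = float(line.split(":", 1)[1])
            observations.append((responses, weight))
    return observations


training_text = "# training data\nCondition\n-\nResponse\nA:0\tB:0\tC:0\nWeight:1\n"
observations = parse_training_text(training_text)
print(observations)
```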

Model Outputs

The model outputs define network nodes and their weights, which determine each node’s contribution to the global signaling output (e.g., cell proliferation or death).

Format

Each model output contains:

  1. Node name: string value.

  2. Weight (positive for proliferation, negative for death): continuous numeric value.
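
One way such weights can be combined into a single score in [0, 1] is a min-max normalized weighted sum of node states. The sketch below illustrates that idea only. The normalization is an assumption made here for illustration, and PyDrugLogics may compute its global output differently.

```python
def global_output(state, weights):
    """Weighted sum of node states, rescaled so the result lies in [0, 1]."""
    raw = sum(weight * state[node] for node, weight in weights.items())
    highest = sum(weight for weight in weights.values() if weight > 0)
    lowest = sum(weight for weight in weights.values() if weight < 0)
    if highest == lowest:
        raise ValueError("Weights must not all be zero.")
    return (raw - lowest) / (highest - lowest)


weights = {"A": 1.0, "B": -1.0, "C": -2.0}
# Only the proliferation node is active.
print(global_output({"A": 1, "B": 0, "C": 0}, weights))
# Only the death nodes are active.
print(global_output({"A": 0, "B": 1, "C": 1}, weights))
```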

Initialization Options

1. Load Model Outputs from File

This method loads model outputs directly from a file. The file can be named, for example, modeloutputs.tab or modeloutputs, and its content looks like this:

# Name   Weight
A  1.0
B  -1.0
C  -2.0

The names and weights are tab-separated.

Example:

from pydruglogics.input.ModelOutputs import ModelOutputs

model_outputs = ModelOutputs(input_file='./path/to/modeloutputs')

2. Direct Initialization

This method initializes the model outputs using Python data structures. Outputs are provided as a dictionary, with keys representing node names and values representing their corresponding weights.

Example:

from pydruglogics.input.ModelOutputs import ModelOutputs

model_outputs_dict = {
    "A": 1.0,
    "B": -1.0,
    "C": -2.0
}
model_outputs = ModelOutputs(input_dictionary=model_outputs_dict)
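
The two initialization options describe the same data. As a small sketch using only the standard library, the hypothetical helper below turns the tab-separated file content into the dictionary form. It is not a PyDrugLogics function.

```python
def parse_model_outputs(text):
    """Read 'Name<TAB>Weight' rows into {name: weight}, skipping '#' comments."""
    outputs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, weight = line.split()
        outputs[name] = float(weight)
    return outputs


file_content = "# Name\tWeight\nA\t1.0\nB\t-1.0\nC\t-2.0\n"
parsed_outputs = parse_model_outputs(file_content)
print(parsed_outputs)
```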

Perturbations

A perturbation is a single drug or a combination of drugs applied to the system. The perturbations list contains all drug combinations to be tested, and the drug data describes the effect of each drug on its target nodes.

Note

Only 1- and 2-drug perturbations are allowed. Perturbations with more than two drugs are not supported.

Initialization From Dictionary

You can define both drug_data and perturbation_data, or just drug_data:

  1. Define `drug_data` and `perturbation_data`:

Drug data: a list of drugs, where each drug entry specifies:

  • Drug name: Unique name of the drug.

  • Target(s): The node(s) in the network affected by the drug.

  • Effect: How the drug influences the target. It can take the following values:
    • activates: The drug increases the target’s activity.

    • inhibits: The drug decreases the target’s activity (this is the default if no effect is specified).

Perturbation data: a list of perturbations, where each entry is a single drug or a two-drug combination.

If both drug_data and perturbation_data are defined, the explicitly provided perturbations will be used.

Example:

from pydruglogics.input.Perturbations import Perturbation

# Define drug_data
drug_data = [
    ['PI', 'A', 'inhibits'],     # PI inhibits target A
    ['PD', 'B', 'activates'],    # PD activates target B
    ['CT', 'C, D, E'],           # CT inhibits targets C, D, and E
    ['BI', 'F, G'],              # BI inhibits targets F and G
    ['PK', 'H'],                 # PK inhibits target H
    ['AK', 'I']                  # AK inhibits target I
]

# Define perturbation_data
perturbation_data = [
    ['PI'],
    ['PD'],
    ['CT'],
    ['BI'],
    ['PK'],
    ['AK'],
    ['PI', 'PD'],
    ['PD', 'PK'],
    ['CT', 'AK'],
    ['BI', 'PK'],
    ['BI', 'AK']
]

perturbations = Perturbation(drug_data=drug_data, perturbation_data=perturbation_data)

  2. Define only `drug_data`:

If no perturbation_data is provided, the pipeline will automatically generate all possible two-drug combinations from the drug_data.

Example:

from pydruglogics.input.Perturbations import Perturbation

drug_data = [
    ['PI', 'A', 'inhibits'],     # PI inhibits target A
    ['PD', 'B', 'activates'],    # PD activates target B
    ['CT', 'C, D, E'],           # CT inhibits targets C, D, and E
    ['BI', 'F, G'],              # BI inhibits targets F and G
    ['PK', 'H'],                 # PK inhibits target H
    ['AK', 'I']                  # AK inhibits target I
]

perturbations = Perturbation(drug_data=drug_data)
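
To give a sense of what automatic generation produces, the sketch below builds single-drug and two-drug perturbations with itertools. Including the single-drug entries is an assumption based on the note that 1- and 2-drug perturbations are allowed. The exact list and ordering produced by PyDrugLogics may differ.

```python
from itertools import combinations

drug_names = ["PI", "PD", "CT", "BI", "PK", "AK"]

# Every drug on its own, followed by every unordered pair of distinct drugs.
singles = [[name] for name in drug_names]
pairs = [list(pair) for pair in combinations(drug_names, 2)]
generated_perturbations = singles + pairs

print(len(singles), len(pairs))
print(pairs[:3])
```

With six drugs there are 6 single-drug and 15 two-drug perturbations, so explicit perturbation_data is mostly useful when only a subset should be tested.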

Note

  • If perturbation_data is not provided, it will be automatically calculated to include all drug combinations from the drug_data.

  • The effect field in drug_data is optional. If omitted, the pipeline assumes the effect is inhibits.

  • Multiple targets can be specified for a single drug by listing them in the Target(s) field, separated by commas.

  • Valid options for the effect field are: activates and inhibits.
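
The rules in this note can be sketched as a small normalization step. normalize_drug_entry is a hypothetical helper written for illustration. It mirrors the rules stated above (default effect, comma-separated targets, two valid effects) and is not taken from the PyDrugLogics source.

```python
VALID_EFFECTS = ("activates", "inhibits")


def normalize_drug_entry(entry):
    """Expand ['CT', 'C, D, E'] into an explicit name/targets/effect record."""
    name, targets, *rest = entry
    effect = rest[0] if rest else "inhibits"  # default when the effect is omitted
    if effect not in VALID_EFFECTS:
        raise ValueError(f"Effect must be one of {VALID_EFFECTS}, got {effect!r}")
    return {
        "name": name,
        "targets": [target.strip() for target in targets.split(",")],
        "effect": effect,
    }


print(normalize_drug_entry(["PD", "B", "activates"]))
print(normalize_drug_entry(["CT", "C, D, E"]))
```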

Observed Synergy Scores

Observed synergy scores are ground truth data used to evaluate model predictions. They are typically derived from experimental datasets or literature sources.

Example:

observed_synergy_scores = ["PI-PD", "PD-AK"]
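
When comparing predictions with this list, each tested drug pair needs a binary label. The sketch below assumes that an entry such as "PI-PD" names the two drugs joined by a hyphen and that the order of the drugs does not matter. Both points are assumptions for illustration, and the helper is hypothetical. Drug names that themselves contain a hyphen would need different handling.

```python
observed_synergy_scores = ["PI-PD", "PD-AK"]
tested_pairs = [["PI", "PD"], ["PD", "PK"], ["CT", "AK"], ["AK", "PD"]]


def is_observed_synergy(pair, observed):
    """Return True if the two-drug pair appears in the observed list, in either order."""
    observed_sets = {frozenset(name.split("-")) for name in observed}
    return frozenset(pair) in observed_sets


# 1 marks a pair observed as synergistic, 0 marks everything else.
labels = [int(is_observed_synergy(pair, observed_synergy_scores)) for pair in tested_pairs]
print(labels)
```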