
Perturbation effect adjustment function with tf relationships boolean logic

Adjust the mean of the perturbation effect based on the enrichment score and the provided unary or binary boolean relationships between TFs. For each gene, the mean of the TF-gene pair's perturbation effect is adjusted if the TF is bound to the gene and all of the Relations associated with the TF are satisfied (i.e. they evaluate to True). These relations can be unary conditions or Ands or Ors between TFs. A TF being bound corresponds to a true value, so And(4, 5) is satisfied if both TF 4 and TF 5 are bound to the gene in question. The adjustment is max_adjustment scaled by the pair's enrichment score divided by the magnitude of the largest enrichment score in the tensor; pairs that do not meet these conditions receive the unbound mean.
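The boolean logic described above can be sketched in a few lines. The `Relation`, `And`, and `Or` classes below are hypothetical stand-ins written for this illustration; they assume only what the description states (an `evaluate` method that receives one gene's bound labels, and relations built from TF indices), and the library's own classes may differ.

```python
from abc import ABC, abstractmethod


class Relation(ABC):
    """Hypothetical stand-in for the library's Relation base class."""

    @abstractmethod
    def evaluate(self, bound_vec: list[float]) -> bool:
        """Return True if the relation holds for one gene's bound labels."""


class And(Relation):
    """Satisfied when every listed TF is bound (label == 1)."""

    def __init__(self, *tf_indices: int) -> None:
        self.tf_indices = tf_indices

    def evaluate(self, bound_vec: list[float]) -> bool:
        return all(bound_vec[i] == 1 for i in self.tf_indices)


class Or(Relation):
    """Satisfied when at least one listed TF is bound (label == 1)."""

    def __init__(self, *tf_indices: int) -> None:
        self.tf_indices = tf_indices

    def evaluate(self, bound_vec: list[float]) -> bool:
        return any(bound_vec[i] == 1 for i in self.tf_indices)


# Bound labels of six TFs for a single gene (illustrative values only).
gene_bound_labels = [0.0, 1.0, 0.0, 0.0, 1.0, 1.0]

# TF 4 and TF 5 are both bound, so And(4, 5) is satisfied.
and_satisfied = And(4, 5).evaluate(gene_bound_labels)

# Neither TF 0 nor TF 2 is bound, so Or(0, 2) is not satisfied.
or_satisfied = Or(0, 2).evaluate(gene_bound_labels)

# A TF's mean is adjusted only if every relation in its list is satisfied.
relations_for_tf = [And(4, 5), Or(0, 1)]
all_satisfied = all(r.evaluate(gene_bound_labels) for r in relations_for_tf)
```

Storing TF indices rather than label values lets the same relation object be evaluated against every gene's label vector.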

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| binding_enrichment_data | Tensor | A tensor of enrichment scores for each gene with dimensions [n_genes, n_tfs, 3] where the entries in the third dimension are a matrix with columns [label, enrichment, pvalue]. | required |
| bound_mean | float | The mean for bound genes. | required |
| unbound_mean | float | The mean for unbound genes. | required |
| max_adjustment | float | The maximum adjustment to the base mean based on enrichment. | required |
| tf_relationships | dict[int, list[Relation]] | A dictionary where the keys are TF indices and the values are lists of Relation objects that represent the conditions that must be met for the mean of the perturbation effect associated with the TF-gene pair to be adjusted. | required |
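To make the expected input layout concrete, here is a minimal sketch that builds a toy `binding_enrichment_data` tensor and a matching `tf_relationships` dictionary. All numeric values are made up for illustration, and the empty relation lists are an assumption chosen to keep the example self-contained.

```python
import torch

# Toy inputs for 2 genes x 3 TFs (values are illustrative only).
labels = torch.tensor([[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])  # 1 = bound, 0 = unbound
enrichment = torch.tensor([[2.0, 0.5, 4.0], [0.3, 1.0, 3.0]])
pvalues = torch.tensor([[0.01, 0.60, 0.001], [0.70, 0.04, 0.002]])

# Stack along a new last dimension so that the third axis holds
# [label, enrichment, pvalue] -> shape [n_genes, n_tfs, 3].
binding_enrichment_data = torch.stack([labels, enrichment, pvalues], dim=-1)

# Slicing the third axis recovers the individual columns.
bound_labels = binding_enrichment_data[:, :, 0]
enrichment_scores = binding_enrichment_data[:, :, 1]

# One key per TF index. An empty list places no extra conditions on a TF,
# because all() over an empty list is True.
n_tfs = binding_enrichment_data.shape[1]
tf_relationships = {tf_index: [] for tf_index in range(n_tfs)}
```

With empty relation lists, whether a pair is adjusted depends only on its bound label.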

Returns:

| Type | Description |
| --- | --- |
| torch.Tensor | Adjusted mean as a tensor. |

Raises:

| Type | Description |
| --- | --- |
| ValueError | If tf_relationships is not a dictionary between ints and lists of Relations |
| ValueError | If the tf_relationships dict does not have the same number of TFs as the binding_enrichment_data tensor passed into the function |
| ValueError | If the tf_relationships dict has any TFs in the values that are not also in the keys or any key or value TFs that are out of bounds for the binding_enrichment_data tensor |
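The sketch below shows which kinds of `tf_relationships` inputs would be rejected under checks of this shape. `check_tf_relationships`, `raises_value_error`, and the bare `Relation` class are hypothetical helpers written for this example, not part of the library; only the type, bounds, and count checks on the keys and lists are mirrored here.

```python
class Relation:
    """Hypothetical stand-in for the library's Relation class."""


def check_tf_relationships(tf_relationships, n_tfs: int) -> None:
    """Raise ValueError if the mapping is malformed for n_tfs TFs (sketch)."""
    # Must be a dict mapping int keys to lists of Relation objects.
    if not isinstance(tf_relationships, dict) or not all(
        isinstance(k, int)
        and isinstance(v, list)
        and all(isinstance(r, Relation) for r in v)
        for k, v in tf_relationships.items()
    ):
        raise ValueError("must map ints to lists of Relation objects")
    # Every key must be a valid TF index.
    if not all(0 <= k < n_tfs for k in tf_relationships):
        raise ValueError("TF index out of bounds")
    # There must be exactly one entry per TF.
    if len(tf_relationships) != n_tfs:
        raise ValueError("wrong number of TFs")


def raises_value_error(tf_relationships, n_tfs: int) -> bool:
    """Return True if the check rejects the input."""
    try:
        check_tf_relationships(tf_relationships, n_tfs)
    except ValueError:
        return True
    return False


valid = {0: [], 1: [Relation()], 2: []}
too_few = {0: [], 1: []}  # only 2 entries for 3 TFs
out_of_bounds = {0: [], 1: [], 5: []}  # TF index 5 does not exist
wrong_key_type = {"0": []}  # keys must be ints
```

Since the keys of a dict are distinct, passing both the bounds check and the count check implies that every TF index appears exactly once.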

Source code in yeastdnnexplorer/probability_models/generate_data.py
def perturbation_effect_adjustment_function_with_tf_relationships_boolean_logic(
    binding_enrichment_data: torch.Tensor,
    bound_mean: float,
    unbound_mean: float,
    max_adjustment: float,
    tf_relationships: dict[int, list[Relation]],
) -> torch.Tensor:
    """
    Adjust the mean of the perturbation effect based on the enrichment score and the
    provided unary or binary boolean relationships between TFs. For each gene, the
    mean of the TF-gene pair's perturbation effect will be adjusted if the TF is bound
    to the gene and all of the Relations associated with the TF are satisfied (i.e.
    they evaluate to True). These relations could be unary conditions or Ands or Ors
    between TFs. A TF being bound corresponds to a true value, which means And(4, 5)
    would be satisfied if both TF 4 and TF 5 are bound to the gene in question. The
    adjustment is max_adjustment scaled by the pair's enrichment score divided by the
    magnitude of the largest enrichment score in the tensor.

    :param binding_enrichment_data: A tensor of enrichment scores for each gene with
        dimensions [n_genes, n_tfs, 3] where the entries in the third dimension are a
        matrix with columns [label, enrichment, pvalue].
    :type binding_enrichment_data: torch.Tensor
    :param bound_mean: The mean for bound genes.
    :type bound_mean: float
    :param unbound_mean: The mean for unbound genes.
    :type unbound_mean: float
    :param max_adjustment: The maximum adjustment to the base mean based on enrichment.
    :type max_adjustment: float
    :param tf_relationships: A dictionary where the keys are TF indices and the values
        are lists of Relation objects that represent the conditions that must be met for
        the mean of the perturbation effect associated with the TF-gene pair to be
        adjusted.
    :type tf_relationships: dict[int, list[Relation]]
    :return: Adjusted mean as a tensor.
    :rtype: torch.Tensor
    :raises ValueError: If tf_relationships is not a dictionary between ints and lists
        of Relations
    :raises ValueError: If the tf_relationships dict does not have the same number of
        TFs as the binding_enrichment_data tensor passed into the function
    :raises ValueError: If the tf_relationships dict has any TFs in the values that are
        not also in the keys or any key or value TFs that are out of bounds for the
        binding_enrichment_data tensor

    """
    if (
        not isinstance(tf_relationships, dict)
        or not all(isinstance(v, list) for v in tf_relationships.values())
        or not all(isinstance(k, int) for k in tf_relationships.keys())
        or not all(
            isinstance(i, Relation) for v in tf_relationships.values() for i in v
        )
    ):
        # adjacent string literals are concatenated, which avoids embedding the
        # source indentation in the message as a backslash continuation would
        raise ValueError(
            "tf_relationships must be a dictionary between "
            "ints and lists of Relation objects"
        )
    if not all(
        k in range(binding_enrichment_data.shape[1]) for k in tf_relationships.keys()
    ):
        raise ValueError(
            "all TFs mentioned in tf_relationships must be within "
            "the bounds of the binding_data tensor's number of TFs"
        )
    if len(tf_relationships) != binding_enrichment_data.shape[1]:
        raise ValueError(
            "tf_relationships must have the same number of TFs as "
            "the binding_data tensor passed into the function"
        )

    # Extract bound/unbound labels and enrichment scores
    bound_labels = binding_enrichment_data[:, :, 0]  # shape: (num_genes, num_tfs)
    enrichment_scores = binding_enrichment_data[:, :, 1]  # shape: (num_genes, num_tfs)

    # start from the enrichment scores of bound TF-gene pairs, with unbound pairs set
    # to 0; the loop below then overwrites every entry with either the adjusted
    # bound mean or unbound_mean
    adjusted_mean_matrix = torch.where(
        bound_labels == 1, enrichment_scores, torch.zeros_like(enrichment_scores)
    )  # shape: (num_genes, num_tfs)

    # the largest enrichment score is constant across the loop, so compute its
    # magnitude once
    max_enrichment_magnitude = abs(enrichment_scores.max())

    for gene_idx in range(bound_labels.shape[0]):
        # bound/unbound labels of every TF for this gene, passed to each Relation
        gene_bound_labels = bound_labels[gene_idx].tolist()
        for tf_index, relations in tf_relationships.items():
            # adjust only if the TF is bound and all of its relations
            # (boolean relationships) are satisfied
            if bound_labels[gene_idx, tf_index] == 1 and all(
                relation.evaluate(gene_bound_labels) for relation in relations
            ):
                # divide the enrichment score by the maximum magnitude possible to
                # create an adjustment multiplier that scales with increasing
                # enrichment
                adjustment_multiplier = (
                    enrichment_scores[gene_idx, tf_index] / max_enrichment_magnitude
                )

                # shift the bound mean by that portion of the max adjustment
                adjusted_mean_matrix[gene_idx, tf_index] = bound_mean + (
                    adjustment_multiplier * max_adjustment
                )
            else:
                # the TF is unbound or its relations are not all satisfied, so set
                # the mean to unbound_mean
                adjusted_mean_matrix[gene_idx, tf_index] = unbound_mean

    return adjusted_mean_matrix  # shape (num_genes, num_tfs)
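
The per-pair arithmetic in the loop above can be checked by hand with a scalar sketch. `adjusted_mean` is a hypothetical helper written for this example, and the numeric values are made up; it mirrors only the branch logic for a single TF-gene pair.

```python
def adjusted_mean(
    is_bound: bool,
    relations_satisfied: bool,
    enrichment: float,
    max_enrichment: float,
    bound_mean: float,
    unbound_mean: float,
    max_adjustment: float,
) -> float:
    """Scalar version of the adjustment for one TF-gene pair (sketch)."""
    if is_bound and relations_satisfied:
        # Scale the enrichment by the magnitude of the largest enrichment score.
        multiplier = enrichment / abs(max_enrichment)
        return bound_mean + multiplier * max_adjustment
    # Unbound, or at least one relation not satisfied.
    return unbound_mean


# Illustrative settings: the largest enrichment score in the tensor is 4.0.
params = dict(max_enrichment=4.0, bound_mean=3.0, unbound_mean=0.0, max_adjustment=2.0)

half_enriched = adjusted_mean(True, True, enrichment=2.0, **params)  # 3.0 + 0.5 * 2.0
most_enriched = adjusted_mean(True, True, enrichment=4.0, **params)  # 3.0 + 1.0 * 2.0
relation_failed = adjusted_mean(True, False, enrichment=4.0, **params)
unbound = adjusted_mean(False, True, enrichment=4.0, **params)
```

Under these values the pair with the largest enrichment receives the full `max_adjustment`, while a bound pair whose relations fail is treated the same as an unbound pair.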