Probability Models

Default function to adjust the mean of the perturbation effect based on the enrichment score.

All functions that are passed to generate_perturbation_effects() in the argument adjustment_function must have the same signature as this function.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| binding_enrichment_data | Tensor | A tensor of enrichment scores for each gene with dimensions [n_genes, n_tfs, 3], where the third dimension holds the columns [label, enrichment, pvalue]. | required |
| bound_mean | float | The mean for bound genes. | required |
| unbound_mean | float | The mean for unbound genes. | required |
| max_adjustment | float | The maximum adjustment to the base mean based on enrichment. | required |
| tf_relationships | dict[int, list[int]], optional | Unused in this function. It is only here to match the signature of the other adjustment functions. | not required (accepted through `**kwargs`) |

Returns:

| Type | Description |
| --- | --- |
| torch.Tensor | Adjusted means as a tensor of shape [n_genes, n_tfs]. |
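
As a minimal illustration of the input layout and the arithmetic, the sketch below builds a toy `binding_enrichment_data` tensor and computes the adjusted means by hand. The numbers and the names `toy_data` and `expected` are invented for this example. The formula, bound_mean + (enrichment / |largest enrichment|) * max_adjustment for bound entries and unbound_mean otherwise, follows the source listing on this page.

```python
import torch

# Toy input with shape [n_genes=3, n_tfs=2, 3]; the last axis holds
# [label, enrichment, pvalue]. All numbers are invented for illustration.
toy_data = torch.tensor(
    [
        [[1.0, 2.0, 0.01], [0.0, 0.5, 0.80]],
        [[0.0, 0.3, 0.60], [1.0, 4.0, 0.001]],
        [[1.0, 1.0, 0.04], [0.0, 0.1, 0.90]],
    ]
)

bound_mean, unbound_mean, max_adjustment = 3.0, 0.0, 2.0

labels = toy_data[:, :, 0]
enrichment = toy_data[:, :, 1]

# Bound entries get bound_mean plus a share of max_adjustment proportional to
# enrichment / |largest enrichment|; unbound entries get unbound_mean.
scale = enrichment / enrichment.max().abs()
expected = torch.where(
    labels == 1,
    bound_mean + scale * max_adjustment,
    torch.full_like(enrichment, unbound_mean),
)
print(expected)
```

With these values the largest enrichment score is 4.0, so the three bound entries come out as 4.0, 5.0 and 3.5, and every unbound entry is 0.0.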

Source code in yeastdnnexplorer/probability_models/generate_data.py
def default_perturbation_effect_adjustment_function(
    binding_enrichment_data: torch.Tensor,
    bound_mean: float,
    unbound_mean: float,
    max_adjustment: float,
    **kwargs,
) -> torch.Tensor:
    """
    Default function to adjust the mean of the perturbation effect based on the
    enrichment score.

    All functions that are passed to generate_perturbation_effects() in the argument
    adjustment_function must have the same signature as this function.

    :param binding_enrichment_data: A tensor of enrichment scores for each gene with
        dimensions [n_genes, n_tfs, 3] where the entries in the third dimension are a
        matrix with columns [label, enrichment, pvalue].
    :type binding_enrichment_data: torch.Tensor
    :param bound_mean: The mean for bound genes.
    :type bound_mean: float
    :param unbound_mean: The mean for unbound genes.
    :type unbound_mean: float
    :param max_adjustment: The maximum adjustment to the base mean based on enrichment.
    :type max_adjustment: float
    :param tf_relationships: Unused in this function. It is accepted through
        ``**kwargs`` only to match the signature of the other adjustment functions.
    :type tf_relationships: dict[int, list[int]], optional
    :return: Adjusted mean as a tensor.
    :rtype: torch.Tensor

    """
    # Extract bound/unbound labels and enrichment scores
    bound_labels = binding_enrichment_data[:, :, 0]
    enrichment_scores = binding_enrichment_data[:, :, 1]

    adjusted_mean_matrix = torch.where(
        bound_labels == 1, enrichment_scores, torch.zeros_like(enrichment_scores)
    )

    for gene_idx in range(bound_labels.shape[0]):
        for tf_index in range(bound_labels.shape[1]):
            if bound_labels[gene_idx, tf_index] == 1:
                # divide the enrichment score by the absolute value of the largest
                # enrichment score to get an adjustment multiplier that scales with
                # increasing enrichment
                adjustment_multiplier = enrichment_scores[gene_idx, tf_index] / abs(
                    enrichment_scores.max()
                )

                # shift the bound mean by a fraction of the max adjustment
                adjusted_mean_matrix[gene_idx, tf_index] = bound_mean + (
                    adjustment_multiplier * max_adjustment
                )
            else:
                # the TF does not bind this gene, so set the mean to the
                # unbound mean
                adjusted_mean_matrix[gene_idx, tf_index] = unbound_mean

    return adjusted_mean_matrix
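
Because any replacement passed as adjustment_function has to accept the same arguments, a custom function could be sketched as follows. The name `flat_adjustment_function` and its behaviour (ignoring enrichment strength altogether) are hypothetical, chosen only to show the signature. It is not part of yeastdnnexplorer, and it is called directly here rather than through generate_perturbation_effects().

```python
import torch


def flat_adjustment_function(
    binding_enrichment_data: torch.Tensor,
    bound_mean: float,
    unbound_mean: float,
    max_adjustment: float,
    **kwargs,
) -> torch.Tensor:
    """Hypothetical example: assign bound_mean or unbound_mean with no scaling."""
    # max_adjustment and any extra keyword arguments (such as tf_relationships)
    # are accepted for signature compatibility but deliberately ignored.
    bound_labels = binding_enrichment_data[:, :, 0]
    return torch.where(
        bound_labels == 1,
        torch.full_like(bound_labels, bound_mean),
        torch.full_like(bound_labels, unbound_mean),
    )


# Direct call on a tiny [n_genes=2, n_tfs=1, 3] tensor of made-up values.
data = torch.tensor([[[1.0, 2.0, 0.01]], [[0.0, 0.2, 0.70]]])
result = flat_adjustment_function(data, 3.0, 0.0, 1.0, tf_relationships={0: []})
print(result)
```

Accepting `**kwargs` is what lets one call site pass tf_relationships to every adjustment function, whether or not a given function uses it.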