decoupler.mt.viper
- decoupler.mt.viper = <decoupler._Method.Method object>
Virtual Inference of Protein-activity by Enriched Regulon analysis (VIPER) [ASG+16].
This approach first ranks features based on their absolute values and computes a one-tail score from the following quantities:
- a vector of interaction weights across features
- a vector of interaction likelihoods across features
- a vector of quantiles based on the molecular readouts across features
- the number of features in these vectors
- the inverse of the cumulative distribution function of the standard normal distribution
- the z-scores of the deviation of the quantiles from 0.5
The one-tail score encodes the magnitude of the enrichment score, irrespective of the interaction signs in net.
Then, the quantiles are z-transformed and weighted by their interaction strength and likelihood. In this case, the resulting score takes the direction (sign) of interactions into consideration.
Afterwards, a summary score is obtained.
An enrichment score is obtained by comparing the summary score to a null model generated through an analytical approach that shuffles features, together with a p-value.
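The quantile and z-score transformations mentioned above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not decoupler's implementation: the helper names (rank_quantiles, signed_z, one_tail_z) are hypothetical, quantiles are assumed to be rank / (n + 1) with no tie handling, and the deviation from 0.5 is assumed to be folded back into (0.5, 1) before applying the inverse normal CDF.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def rank_quantiles(values):
    """Map values to quantiles in (0, 1) as rank / (n + 1). Ties are not handled."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    quantiles = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        quantiles[idx] = rank / (n + 1)
    return quantiles


def signed_z(quantiles):
    """z-transform the quantiles directly, so the direction (sign) is kept."""
    return [_STD_NORMAL.inv_cdf(q) for q in quantiles]


def one_tail_z(quantiles):
    """z-scores of the deviation of quantiles from 0.5 (magnitude only).

    Assumption for this sketch: |q - 0.5| is shifted into [0.5, 1) first.
    """
    return [_STD_NORMAL.inv_cdf(abs(q - 0.5) + 0.5) for q in quantiles]


readouts = [-2.0, -0.5, 0.1, 1.5, 3.0]  # made-up molecular readouts
q = rank_quantiles(readouts)
z_signed = signed_z(q)
z_abs = one_tail_z(q)
```

In this sketch the signed z-scores are symmetric around the median feature, while the one-tail z-scores are non-negative and identical for features that are equally far from the median in either direction.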
Additionally, when computing multiple sources simultaneously, a pleiotropic correction can be employed.
In brief, all possible pairs of sources AB are generated under two conditions:
- both A and B are significantly enriched (p < reg_sign=0.05)
- they share at least n_targets=10 features
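The pair-selection rule can be illustrated with a short sketch. It is not taken from decoupler's source: candidate_pairs is a hypothetical helper, the p-values and target sets are made up, and pairs are treated as unordered here, which is an assumption.

```python
from itertools import combinations


def candidate_pairs(pvals, targets, reg_sign=0.05, n_targets=10):
    """Return (A, B, shared) for significant sources sharing enough features."""
    # Condition 1: keep only sources that are significantly enriched.
    significant = sorted(s for s, p in pvals.items() if p < reg_sign)
    pairs = []
    for a, b in combinations(significant, 2):
        shared = targets[a] & targets[b]
        # Condition 2: the pair must share at least n_targets features.
        if len(shared) >= n_targets:
            pairs.append((a, b, shared))
    return pairs


# Made-up p-values and target sets for three sources.
pvals = {"A": 0.01, "B": 0.03, "C": 0.40}
targets = {
    "A": {f"g{i}" for i in range(0, 15)},
    "B": {f"g{i}" for i in range(5, 20)},
    "C": {f"g{i}" for i in range(0, 20)},
}
pairs = candidate_pairs(pvals, targets)
```

With these inputs only the pair (A, B) qualifies: C overlaps with both sources but is not significant at the 0.05 threshold.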
Subsequently, an enrichment score and its associated p-value are computed for both A and B based only on the shared features. Then the pleiotropy score is computed from these, also using:
- the number of test pairs involving the source A
- the number of test pairs involving the source B
This score is used to update the interaction likelihoods.
A new enrichment score and p-value are calculated following all the previous steps but using the updated likelihoods.
Finally, the obtained p-values are adjusted by Benjamini-Hochberg correction.
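For readers unfamiliar with the final adjustment step, the standard Benjamini-Hochberg procedure can be sketched as follows. The function name benjamini_hochberg and the input values are illustrative only; decoupler's own implementation may differ in its details.

```python
def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
```

Each raw p-value is scaled by the number of tests divided by its rank, and the running minimum enforces monotonicity across ranks.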
- Parameters:
  - data – anndata.AnnData instance, pandas.DataFrame, or a tuple of (matrix, samples, features). All methods assume that input values follow a normal distribution unless otherwise specified. Therefore, when working with observational count data, some form of normalization is required (e.g., scanpy’s library-size normalization followed by log1p). Using raw integer counts is not recommended, as they follow a Poisson distribution. Feature scaling on normalized counts is also acceptable, but note that it changes the results by assuming equal importance across features, and outcomes will vary depending on which observations are included. No normalization or transformation is required when using contrast-level feature statistics such as log fold changes or Wald test statistics.
  - net – DataFrame in long format. Must include source and target columns, and optionally a weight column.
  - tmin (default: 5) – Minimum number of targets per source. Sources with fewer targets will be removed.
  - layer – Layer key name of an anndata.AnnData instance.
  - raw (default: False) – Whether to use the .raw attribute of anndata.AnnData.
  - empty (default: True) – Whether to remove empty observations (rows) or features (columns).
  - bsize (default: 250000) – For large datasets in sparse format, this parameter controls how many observations are processed at once. Increasing this value speeds up computation but uses more memory.
  - verbose (default: False) – Whether to display progress messages and additional execution details.
  - pleiotropy – Whether correction for pleiotropic regulation should be performed.
  - reg_sign – If pleiotropy, p-value threshold for considering significant regulators.
  - n_targets – If pleiotropy, integer indicating the minimal number of overlapping targets to consider for analysis.
  - penalty – If pleiotropy, number higher than 1 indicating the penalty for the pleiotropic interactions. 1 = no penalty.
- Returns:
Enrichment scores and, if applicable, p-values adjusted by Benjamini-Hochberg correction.
Example
import decoupler as dc

# Load the toy dataset and its network
adata, net = dc.ds.toy()
# Run VIPER, keeping sources with at least 3 targets
dc.mt.viper(adata, net, tmin=3)
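To make the expected shape of net and the effect of tmin more concrete, the sketch below builds a small long-format network with pandas and filters sources by their number of targets. The source and gene names are made up, and the filter only approximates the documented behaviour: it is an assumption that counting unique targets per source is enough, and any matching of targets against the features present in the data is not shown.

```python
import pandas as pd

# Hypothetical toy network in long format: one row per source-target interaction.
net = pd.DataFrame(
    {
        "source": ["TF1", "TF1", "TF1", "TF2", "TF2"],
        "target": ["G1", "G2", "G3", "G1", "G4"],
        "weight": [1.0, -1.0, 0.5, 1.0, 1.0],
    }
)

tmin = 3  # illustrative threshold; the documented default is 5

# Count unique targets per source and keep sources with at least tmin targets.
targets_per_source = net.groupby("source")["target"].nunique()
kept_sources = targets_per_source[targets_per_source >= tmin].index
filtered = net[net["source"].isin(kept_sources)]
```

Here TF2 has only two targets, so it is dropped and only the three TF1 interactions remain.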