decoupler.pp.filter_by_prop

decoupler.pp.filter_by_prop(adata, min_prop=0.2, min_smpls=2, inplace=True)

Determine which genes are expressed in a sufficient proportion of cells across samples.

A gene is kept if it is expressed in at least min_prop of the cells of a sample, and this condition holds in at least min_smpls samples.

Parameters:
  • adata (AnnData) – Annotated data matrix with observations (rows) and features (columns).

  • min_prop (float (default: 0.2)) – Minimum proportion of cells that express a gene in a sample.

  • min_smpls (int (default: 2)) – Minimum number of samples in which the proportion of cells expressing a gene is greater than or equal to min_prop.

  • inplace (bool (default: True)) – Whether to filter adata in place instead of returning the genes to be kept.

Return type:

None | ndarray

Returns:

If inplace=False, an array of the genes to be kept; otherwise None.

Example

import decoupler as dc

adata = dc.ds.covid5k()
pdata = dc.pp.pseudobulk(adata, sample_col="individual", groups_col="celltype")
dc.pp.filter_by_prop(pdata)
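
The selection rule described above can be sketched with plain NumPy. This is a minimal illustration under stated assumptions, not decoupler's actual implementation: select_genes_by_prop is a hypothetical helper, and props is made-up data holding, for each sample (row) and gene (column), the proportion of cells that express the gene.

```python
import numpy as np


def select_genes_by_prop(props, gene_names, min_prop=0.2, min_smpls=2):
    """Hypothetical helper illustrating the rule; not part of decoupler."""
    props = np.asarray(props, dtype=float)
    # For each gene (column), count the samples (rows) in which the
    # proportion of expressing cells is at least min_prop.
    n_smpls = np.sum(props >= min_prop, axis=0)
    # Keep genes that meet the condition in at least min_smpls samples.
    keep = n_smpls >= min_smpls
    return np.asarray(gene_names)[keep]


# Made-up proportions: 3 samples (rows) x 3 genes (columns).
props = np.array([
    [0.50, 0.10, 0.25],
    [0.30, 0.40, 0.05],
    [0.20, 0.00, 0.10],
])
genes = np.array(["G1", "G2", "G3"])

kept = select_genes_by_prop(props, genes)
print(kept)  # only G1 passes in two or more samples
```

With the defaults, G1 passes in all three samples, while G2 and G3 each pass in only one, so only G1 is kept. Lowering min_smpls to 1 would keep all three genes.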