Conversion to other organisms

Most of the prior knowledge stored in OmniPath is derived from human data, so its resources use human gene names. Nevertheless, gene names can be converted to other organisms using homology.

Here is the list of available organisms:

[1]:
import decoupler as dc

dc.show_organisms()
[1]:
['anole_lizard',
 'c.elegans',
 'cat',
 'cattle',
 'chicken',
 'chimpanzee',
 'dog',
 'fruitfly',
 'horse',
 'macaque',
 'mouse',
 'opossum',
 'pig',
 'platypus',
 'rat',
 's.cerevisiae',
 's.pombe',
 'xenopus',
 'zebrafish']
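Organism names have to match the entries above exactly (for example 'fruitfly', not 'fruit_fly'). As a minimal sketch, a name can be checked before any resource is requested. The helper check_organism is hypothetical and not part of decoupler; the list is copied by hand from the output above, and in practice it would come from dc.show_organisms().

```python
import difflib

# Copied from the output of dc.show_organisms() shown above.
ORGANISMS = [
    'anole_lizard', 'c.elegans', 'cat', 'cattle', 'chicken', 'chimpanzee',
    'dog', 'fruitfly', 'horse', 'macaque', 'mouse', 'opossum', 'pig',
    'platypus', 'rat', 's.cerevisiae', 's.pombe', 'xenopus', 'zebrafish',
]


def check_organism(name, available=ORGANISMS):
    """Hypothetical helper: return the normalised name or raise ValueError."""
    key = name.strip().lower()
    if key in available:
        return key
    # Suggest similarly spelled names to make typos easier to spot.
    hints = difflib.get_close_matches(key, available, n=3)
    raise ValueError(f"Unknown organism {name!r}. Close matches: {hints}")


print(check_organism('mouse'))
```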

To show how this works in decoupler, we will load the SIGNOR resource and translate its gene symbols to mouse and fruit fly.

[2]:
net = dc.get_resource('SIGNOR')
net
[2]:
genesymbol pathway
0 ABL1 Cell cycle: G2/M phase transition
1 ACE Focal segmental glomerulosclerosis
2 ACTB Axon guidance
3 ACTN1 Axon guidance
4 ACTN1 Glutamatergic synapse
... ... ...
2943 YY1 NOTCH Signaling
2944 ZAP70 T cell activation
2945 ZAP70 P38 Signaling
2946 ZBTB16 Acute Myeloid Leukemia
2947 ZFYVE9 TGF-beta Signaling

2948 rows × 2 columns

We can then translate the resource to mouse gene symbols:

[3]:
# Translate human gene symbols to mouse
m_net = dc.translate_net(
    net,
    target_organism='mouse',
)
m_net
/home/badi/miniforge3/envs/dcp/lib/python3.11/site-packages/liana/resource/_orthology.py:199: DtypeWarning: Columns (0) have mixed types. Specify dtype option on import or set low_memory=False.
[3]:
genesymbol pathway
0 Abl1 Cell cycle: G2/M phase transition
1 Ace Focal segmental glomerulosclerosis
2 Actn1 Axon guidance
3 Actn1 Glutamatergic synapse
4 Acvr1b Multiple sclerosis
... ... ...
1386 Yy1 NOTCH Signaling
1387 Zap70 T cell activation
1388 Zap70 P38 Signaling
1389 Zbtb16 Acute Myeloid Leukemia
1390 Zfyve9 TGF-beta Signaling

1391 rows × 2 columns

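Note that homology conversion can gain or lose genes from one organism to another, because a human gene may have several orthologs, or none, in the target organism. Here, the mouse network has 1391 rows compared with 2948 in the human one.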
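Why the number of rows changes can be illustrated with a toy example in plain Python. This is only a sketch of the general idea, not how dc.translate_net is implemented: the gene names, the ortholog table and the translate helper are all hypothetical.

```python
# Toy human -> target ortholog table (hypothetical pairs, for illustration only).
orthologs = {
    "GENE_A": ["gene_a"],              # one-to-one: row is kept
    "GENE_B": ["gene_b1", "gene_b2"],  # one-to-many: row is duplicated
    # "GENE_C" has no ortholog, so its rows are dropped
}

# Toy network in the same (genesymbol, pathway) layout as the resource above.
net = [
    ("GENE_A", "Pathway 1"),
    ("GENE_B", "Pathway 1"),
    ("GENE_C", "Pathway 2"),
    ("GENE_C", "Pathway 3"),
]


def translate(rows, mapping):
    """Replace each gene by its orthologs; genes without orthologs vanish."""
    out = []
    for gene, pathway in rows:
        for ortholog in mapping.get(gene, []):
            out.append((ortholog, pathway))
    return out


translated = translate(net, orthologs)
print(len(net), "rows before,", len(translated), "rows after")
```

In this toy case GENE_B gains a row and GENE_C loses two, so the translated network differs from the original in both directions.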

Let us try the fruit fly now:

[4]:
# Translate human gene symbols to fruit fly
f_net = dc.translate_net(
    net,
    target_organism='fruitfly',
)
f_net
/home/badi/miniforge3/envs/dcp/lib/python3.11/site-packages/liana/resource/_orthology.py:199: DtypeWarning: Columns (0,7) have mixed types. Specify dtype option on import or set low_memory=False.
[4]:
genesymbol pathway
0 Abl Cell cycle: G2/M phase transition
1 Actn Axon guidance
2 Actn Glutamatergic synapse
3 babo Multiple sclerosis
4 put Multiple sclerosis
... ... ...
725 Diap2 Death Receptor Signaling
726 Diap2 Inhibition of Apoptosis
727 yki Hippo Signaling
728 14-3-3zeta SAPK/JNK Signaling
729 Sara TGF-beta Signaling

730 rows × 2 columns

Additionally, all OmniPath functions directly accept the organism parameter:

[5]:
dc.get_resource('SIGNOR', organism='zebrafish')
/home/badi/miniforge3/envs/dcp/lib/python3.11/site-packages/liana/resource/_orthology.py:199: DtypeWarning: Columns (0) have mixed types. Specify dtype option on import or set low_memory=False.
[5]:
genesymbol pathway
0 abl1 Cell cycle: G2/M phase transition
1 ace Focal segmental glomerulosclerosis
2 actn1 Axon guidance
3 actn1 Glutamatergic synapse
4 adsl Nucleotide Biosynthesis
... ... ...
772 xiap Inhibition of Apoptosis
773 yap1 Hippo Signaling
774 zap70 T cell activation
775 zap70 P38 Signaling
776 zfyve9a TGF-beta Signaling

777 rows × 2 columns
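The DtypeWarning printed with the outputs above is raised while the orthology table is read. If it clutters a notebook, one option is to silence it locally with the standard warnings module. The sketch below matches the warning by its message text; quiet_call and noisy_loader are hypothetical names, and noisy_loader only stands in for a call such as dc.get_resource so that the example runs on its own.

```python
import warnings

# Start of the warning message shown in the outputs above.
MIXED_TYPES = r"Columns \(.*\) have mixed types"


def quiet_call(func, *args, **kwargs):
    """Hypothetical helper: run func with the mixed-types warning ignored."""
    with warnings.catch_warnings():
        # The filter only applies inside this block; it is undone on exit.
        warnings.filterwarnings("ignore", message=MIXED_TYPES)
        return func(*args, **kwargs)


def noisy_loader():
    """Stand-in for a loader that emits the same warning text."""
    warnings.warn(
        "Columns (0) have mixed types. Specify dtype option on import "
        "or set low_memory=False.",
        UserWarning,
    )
    return "resource"


result = quiet_call(noisy_loader)
print(result)
```

With decoupler installed, the same helper could wrap the call from the cell above, for example quiet_call(dc.get_resource, 'SIGNOR', organism='zebrafish'). Filtering by message leaves every other warning visible.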

[6]:
dc.get_progeny(organism='anole_lizard')
[6]:
source target weight p_value
0 Androgen tmprss2 11.490631 0.0
1 Androgen nkx3-1 10.622551 0.0
2 Androgen mboat2 10.472733 0.0
3 Androgen SLC38A4 7.363805 0.0
4 Androgen mtmr9 6.130646 0.0
... ... ... ... ...
153279 p53 tagap -0.000638 0.998829
153280 p53 MALRD1 -0.000406 0.999307
153281 p53 LOC100559499 -0.000291 0.999398
153282 p53 gsk3a -0.00006 0.999796
153283 p53 gng4 0.000089 0.99988

153284 rows × 4 columns

[7]:
dc.get_collectri(organism='s.cerevisiae')
[7]:
source target weight pmid
0 SPT15 TOA1 1 10078202;10523649;10581267;10617594;10675336;1...
1 TOA1 SPT15 1 12818428
2 MOT1 SPT15 -1 14988402;15509807;16858867;20627952
3 SPT15 MOT1 1 10082549
4 SPT15 DFR1 1 10336493
... ... ... ... ...
208 HAP2 SSA4 1 <NA>
209 HAP3 SSA4 1 <NA>
210 HAP3 SSA4 1 <NA>
211 HAP5 SSA4 1 <NA>
212 HAP5 SSA4 1 <NA>

213 rows × 4 columns
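In the output above, some translated rows show the same source, target and weight (for example rows 209 and 210), which is what many-to-one orthology can produce. Whether such repeats matter depends on the downstream analysis. As a minimal sketch, repeated source-target pairs can be collapsed by keeping the first occurrence; the rows below are typed in by hand from the tail of the output, and dedupe_edges is a hypothetical helper.

```python
# Rows copied by hand from the tail of the output above: (source, target, weight).
edges = [
    ("HAP2", "SSA4", 1),
    ("HAP3", "SSA4", 1),
    ("HAP3", "SSA4", 1),
    ("HAP5", "SSA4", 1),
    ("HAP5", "SSA4", 1),
]


def dedupe_edges(rows):
    """Hypothetical helper: keep the first row for each (source, target) pair."""
    seen = set()
    unique = []
    for source, target, weight in rows:
        if (source, target) not in seen:
            seen.add((source, target))
            unique.append((source, target, weight))
    return unique


unique_edges = dedupe_edges(edges)
print(len(edges), "rows before,", len(unique_edges), "rows after")
```

On the pandas DataFrame returned by decoupler, the equivalent would be net.drop_duplicates(subset=['source', 'target']).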