Conversion to other organisms
Most of the prior knowledge stored in OmniPath is derived from human data, so its resources use human gene names. However, using homology we can convert these gene names to other organisms.
Here is the list of available organisms:
[1]:
import decoupler as dc
dc.show_organisms()
[1]:
['anole_lizard',
'c.elegans',
'cat',
'cattle',
'chicken',
'chimpanzee',
'dog',
'fruitfly',
'horse',
'macaque',
'mouse',
'opossum',
'pig',
'platypus',
'rat',
's.cerevisiae',
's.pombe',
'xenopus',
'zebrafish']
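Since the organism is passed around as a plain string, a small guard can catch typos before a translation is attempted. The sketch below hard-codes the names printed above; `check_organism` is a hypothetical helper for illustration and is not part of decoupler.

```python
# Organism names copied from the dc.show_organisms() output above.
ORGANISMS = [
    'anole_lizard', 'c.elegans', 'cat', 'cattle', 'chicken', 'chimpanzee',
    'dog', 'fruitfly', 'horse', 'macaque', 'mouse', 'opossum', 'pig',
    'platypus', 'rat', 's.cerevisiae', 's.pombe', 'xenopus', 'zebrafish',
]


def check_organism(name):
    """Return `name` unchanged if it is in ORGANISMS, else raise ValueError.

    Hypothetical helper, not a decoupler function.
    """
    if name not in ORGANISMS:
        raise ValueError(
            f"Unsupported organism {name!r}; choose one of: {', '.join(ORGANISMS)}"
        )
    return name


print(check_organism('mouse'))
```

In a live session the hard-coded list could presumably be replaced by the value returned from `dc.show_organisms()`.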
To showcase how to do this in decoupler, we will load the SIGNOR database and convert its gene symbols to mouse and fruit fly.
[2]:
net = dc.get_resource('SIGNOR')
net
[2]:
| | genesymbol | pathway |
|---|---|---|
| 0 | ABL1 | Cell cycle: G2/M phase transition |
| 1 | ACE | Focal segmental glomerulosclerosis |
| 2 | ACTB | Axon guidance |
| 3 | ACTN1 | Axon guidance |
| 4 | ACTN1 | Glutamatergic synapse |
| ... | ... | ... |
| 2943 | YY1 | NOTCH Signaling |
| 2944 | ZAP70 | T cell activation |
| 2945 | ZAP70 | P38 Signaling |
| 2946 | ZBTB16 | Acute Myeloid Leukemia |
| 2947 | ZFYVE9 | TGF-beta Signaling |
2948 rows × 2 columns
Then, we can easily transform the obtained resource into mouse genes:
[3]:
# Translate human gene symbols to mouse
m_net = dc.translate_net(
    net,
    target_organism='mouse',
)
m_net
/home/badi/miniforge3/envs/dcp/lib/python3.11/site-packages/liana/resource/_orthology.py:199: DtypeWarning: Columns (0) have mixed types. Specify dtype option on import or set low_memory=False.
[3]:
| | genesymbol | pathway |
|---|---|---|
| 0 | Abl1 | Cell cycle: G2/M phase transition |
| 1 | Ace | Focal segmental glomerulosclerosis |
| 2 | Actn1 | Axon guidance |
| 3 | Actn1 | Glutamatergic synapse |
| 4 | Acvr1b | Multiple sclerosis |
| ... | ... | ... |
| 1386 | Yy1 | NOTCH Signaling |
| 1387 | Zap70 | T cell activation |
| 1388 | Zap70 | P38 Signaling |
| 1389 | Zbtb16 | Acute Myeloid Leukemia |
| 1390 | Zfyve9 | TGF-beta Signaling |
1391 rows × 2 columns
Note that when performing homology conversion we might gain or lose some genes from one organism to another. Here, for example, the table shrinks from 2948 rows for human to 1391 rows for mouse.
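To see why rows can be both gained and lost, here is a minimal sketch of a homology join using pandas on made-up data. The gene names and the ortholog table are invented for illustration, and this is not how `dc.translate_net` is implemented internally; it only mimics the general idea of mapping through an ortholog table.

```python
import pandas as pd

# Toy human resource with the same columns as the SIGNOR table above.
toy_net = pd.DataFrame({
    'genesymbol': ['GENE_A', 'GENE_B', 'GENE_C'],
    'pathway': ['P1', 'P1', 'P2'],
})

# Made-up ortholog table:
# GENE_A has two orthologs, GENE_B has one, GENE_C has none.
orthologs = pd.DataFrame({
    'human': ['GENE_A', 'GENE_A', 'GENE_B'],
    'target': ['gene_a1', 'gene_a2', 'gene_b'],
})

# The inner join drops genes without an ortholog (GENE_C) and
# duplicates rows for genes with several orthologs (GENE_A).
translated = (
    toy_net.merge(orthologs, left_on='genesymbol', right_on='human', how='inner')
    .drop(columns=['genesymbol', 'human'])
    .rename(columns={'target': 'genesymbol'})
    [['genesymbol', 'pathway']]
    .drop_duplicates()
    .reset_index(drop=True)
)
print(translated)
```

In this toy case the pathway `P2` disappears entirely because its only member has no ortholog, while `P1` grows by one gene.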
Let us try the fruit fly now:
[4]:
# Translate human gene symbols to fruit fly
f_net = dc.translate_net(
    net,
    target_organism='fruitfly',
)
f_net
/home/badi/miniforge3/envs/dcp/lib/python3.11/site-packages/liana/resource/_orthology.py:199: DtypeWarning: Columns (0,7) have mixed types. Specify dtype option on import or set low_memory=False.
[4]:
| | genesymbol | pathway |
|---|---|---|
| 0 | Abl | Cell cycle: G2/M phase transition |
| 1 | Actn | Axon guidance |
| 2 | Actn | Glutamatergic synapse |
| 3 | babo | Multiple sclerosis |
| 4 | put | Multiple sclerosis |
| ... | ... | ... |
| 725 | Diap2 | Death Receptor Signaling |
| 726 | Diap2 | Inhibition of Apoptosis |
| 727 | yki | Hippo Signaling |
| 728 | 14-3-3zeta | SAPK/JNK Signaling |
| 729 | Sara | TGF-beta Signaling |
730 rows × 2 columns
Additionally, all OmniPath functions accept the organism parameter directly:
[5]:
dc.get_resource('SIGNOR', organism='zebrafish')
/home/badi/miniforge3/envs/dcp/lib/python3.11/site-packages/liana/resource/_orthology.py:199: DtypeWarning: Columns (0) have mixed types. Specify dtype option on import or set low_memory=False.
[5]:
| | genesymbol | pathway |
|---|---|---|
| 0 | abl1 | Cell cycle: G2/M phase transition |
| 1 | ace | Focal segmental glomerulosclerosis |
| 2 | actn1 | Axon guidance |
| 3 | actn1 | Glutamatergic synapse |
| 4 | adsl | Nucleotide Biosynthesis |
| ... | ... | ... |
| 772 | xiap | Inhibition of Apoptosis |
| 773 | yap1 | Hippo Signaling |
| 774 | zap70 | T cell activation |
| 775 | zap70 | P38 Signaling |
| 776 | zfyve9a | TGF-beta Signaling |
777 rows × 2 columns
[6]:
dc.get_progeny(organism='anole_lizard')
[6]:
| | source | target | weight | p_value |
|---|---|---|---|---|
| 0 | Androgen | tmprss2 | 11.490631 | 0.0 |
| 1 | Androgen | nkx3-1 | 10.622551 | 0.0 |
| 2 | Androgen | mboat2 | 10.472733 | 0.0 |
| 3 | Androgen | SLC38A4 | 7.363805 | 0.0 |
| 4 | Androgen | mtmr9 | 6.130646 | 0.0 |
| ... | ... | ... | ... | ... |
| 153279 | p53 | tagap | -0.000638 | 0.998829 |
| 153280 | p53 | MALRD1 | -0.000406 | 0.999307 |
| 153281 | p53 | LOC100559499 | -0.000291 | 0.999398 |
| 153282 | p53 | gsk3a | -0.00006 | 0.999796 |
| 153283 | p53 | gng4 | 0.000089 | 0.99988 |
153284 rows × 4 columns
[7]:
dc.get_collectri(organism='s.cerevisiae')
[7]:
| | source | target | weight | pmid |
|---|---|---|---|---|
| 0 | SPT15 | TOA1 | 1 | 10078202;10523649;10581267;10617594;10675336;1... |
| 1 | TOA1 | SPT15 | 1 | 12818428 |
| 2 | MOT1 | SPT15 | -1 | 14988402;15509807;16858867;20627952 |
| 3 | SPT15 | MOT1 | 1 | 10082549 |
| 4 | SPT15 | DFR1 | 1 | 10336493 |
| ... | ... | ... | ... | ... |
| 208 | HAP2 | SSA4 | 1 | <NA> |
| 209 | HAP3 | SSA4 | 1 | <NA> |
| 210 | HAP3 | SSA4 | 1 | <NA> |
| 211 | HAP5 | SSA4 | 1 | <NA> |
| 212 | HAP5 | SSA4 | 1 | <NA> |
213 rows × 4 columns
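The translated CollecTRI table above contains repeated source-target pairs (rows 209 and 210, and rows 211 and 212). Such repeats can arise after homology conversion, and it may be worth checking for them before using a translated network. The sketch below works on a toy table that copies those rows; keeping the first occurrence is an assumption made for illustration, and a different rule may be needed when repeated edges carry conflicting weights.

```python
import pandas as pd

# Toy network mimicking the repeated HAP3/HAP5 -> SSA4 rows shown above.
toy = pd.DataFrame({
    'source': ['HAP2', 'HAP3', 'HAP3', 'HAP5', 'HAP5'],
    'target': ['SSA4'] * 5,
    'weight': [1, 1, 1, 1, 1],
})

# Flag every row whose (source, target) pair occurs more than once.
dup_mask = toy.duplicated(subset=['source', 'target'], keep=False)
print(f"{dup_mask.sum()} rows belong to repeated edges")

# Keep only the first occurrence of each edge (an illustrative choice).
dedup = toy.drop_duplicates(subset=['source', 'target']).reset_index(drop=True)
print(dedup)
```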