Access ontologies: cell type, tissue, disease, phenotype#
For ontology-defined vocabularies such as cell type, tissue, disease, and phenotype, the entity class additionally exposes the underlying ontology via {entity}.ontology.
import bionty as bt
All available ontologies and their versions can be printed with:
bt.display_available_versions()
Available versions
Ontology | URL | Bion… | Datab… | All … |
---|---|---|---|---|
Ensembl | https://www.ensembl.org/index.html | Spec… | ensem… | rele… |
Ensembl | https://www.ensembl.org/index.html | Gene | ensem… | rele… |
Ensembl | https://www.ensembl.org/index.html | Gene | ensem… | rele…, rele… |
Uniprot | https://www.uniprot.org/ | Prot… | unipr… | 2022… |
Uniprot | https://www.uniprot.org/ | Prot… | unipr… | 2022…, 2022… |
CellMarker | http://bio-bigdata.hrbmu.edu.cn/CellMarker/ | Cell… | cellm… | 2.0 |
Cell Line Ontology | https://bioportal.bioontology.org/ontologies/CLO | Cell… | clo | 2022… |
Cell Ontology | https://obophenotype.github.io/cell-ontology/ | Cell… | cl | 2023… |
Cell Ontology | https://obophenotype.github.io/cell-ontology/ | Cell… | cl | 2023…, 2022… |
Human Cell Atlas Ontology | https://github.com/HumanCellAtlas/ontology | Cell… | ca | 2022… |
Uberon multi-species anatomy … | http://obophenotype.github.io/uberon/ | Tiss… | uberon | 2023… |
Uberon multi-species anatomy … | http://obophenotype.github.io/uberon/ | Tiss… | uberon | 2023…, 2022… |
Mondo Disease Ontology | https://mondo.monarchinitiative.org/ | Dise… | mondo | 2023… |
Mondo Disease Ontology | https://mondo.monarchinitiative.org/ | Dise… | mondo | 2023…, 2022… |
Human Disease Ontology | https://disease-ontology.org/ | Dise… | doid | 2023… |
The Experimental Factor Ontol… | https://bioportal.bioontology.org/ontologies/EFO | Read… | efo | 3.48… |
Human Phenotype Ontology | https://hpo.jax.org/ | Phen… | hp | 2023… |
Pathway Ontology | https://www.ebi.ac.uk/ols/ontologies/pw | Path… | pw | 7.78 |
Pathway Ontology | https://www.ebi.ac.uk/ols/ontologies/pw | Path… | pw | 7.78, 7.74 |
Bioinformatics Pipeline | https://lamin.ai | BFXP… | lamin | 1.0.0 |
Drug Ontology | https://bioportal.bioontology.org/ontologies/DRON/?p… | Drug | dron | 2023… |
The currently used versions can be shown with:
bt.display_active_versions()
Currently used versions in ._current.yaml
Bionty class | Database | Version |
---|---|---|
Species | ensembl | release-108 |
Gene | ensembl | release-108 |
Protein | uniprot | 2022-04 |
CellMarker | cellmarker | 2.0 |
CellLine | clo | 2022-03-21 |
CellType | cl | 2023-02-15 |
Tissue | uberon | 2023-02-14 |
Disease | mondo | 2023-02-06 |
Readout | efo | 3.48.0 |
Phenotype | hp | 2023-01-27 |
Pathway | pw | 7.78 |
BFXPipeline | lamin | 1.0.0 |
Drug | dron | 2023-03-10 |
Cell Type#
Here we look at cell type as an example:
ct = bt.CellType()
df = ct.df()
df.head()
ontology_id | name | children |
---|---|---|
CL:0000000 | cell | [CL:0000003, CL:0001061, CL:0001034] |
CL:0000001 | primary cultured cell | None |
CL:0000002 | obsolete immortal cell line cell | None |
CL:0000003 | native cell | [CL:0007001, CL:0000891, CL:0000630, CL:402900... |
CL:0000004 | obsolete cell by organism | None |
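The children column can be used to walk the hierarchy. The following is a minimal sketch, assuming that children lists only the direct child terms of each row. It runs on a hand-written toy mapping with hypothetical IDs instead of the real table, and the helper name descendants is made up for illustration.

```python
from collections import deque

# Hand-written toy mapping that mimics the ontology_id -> children shape of ct.df().
# The IDs are hypothetical placeholders, not real Cell Ontology terms.
children = {
    "T:0": ["T:1", "T:2"],
    "T:1": ["T:3"],
    "T:2": None,  # None marks a term without children, as in the table above
    "T:3": None,
}


def descendants(children, root):
    """Return every term reachable from root, in breadth-first order (hypothetical helper)."""
    found = []
    visited = {root}
    queue = deque([root])
    while queue:
        current = queue.popleft()
        # Treat a missing key or a None entry as "no children".
        for child in children.get(current) or []:
            if child not in visited:
                visited.add(child)
                found.append(child)
                queue.append(child)
    return found


print(descendants(children, "T:0"))  # ['T:1', 'T:2', 'T:3']
```

With the real DataFrame, df["children"].to_dict() could plausibly supply such a mapping, although array-valued cells may need converting to plain lists first.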
As with other entities, you can look up terms in the vocabulary by tab completion via .lookup():
lookup = bt.CellType().lookup()
lookup.astrocyte
cell_type(ontology_id='CL:0000127', name='astrocyte', children=array(['CL:0002627', 'CL:0000683', 'CL:0002626', 'CL:0002605',
'CL:0012000', 'CL:0000645', 'CL:0000644', 'CL:0002606',
'CL:0002603', 'CL:0002604'], dtype=object))
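To illustrate how term names can map to tab-completable attributes, here is a rough sketch. It is not bionty's actual implementation: the helpers to_attribute and build_lookup are hypothetical, and the records are a few cell types copied from the outputs above with the children field left out.

```python
import re
from collections import namedtuple
from types import SimpleNamespace

# Simplified record; the real lookup entries also carry a children field.
Term = namedtuple("Term", ["ontology_id", "name"])


def to_attribute(name):
    """Turn a term name into a valid Python attribute name (hypothetical helper)."""
    key = re.sub(r"\W+", "_", name.strip().lower()).strip("_")
    # Attribute names cannot start with a digit, so prefix those with an underscore.
    if key and key[0].isdigit():
        key = "_" + key
    return key


def build_lookup(terms):
    """Expose each term as an attribute so that editors can tab-complete it."""
    return SimpleNamespace(**{to_attribute(t.name): t for t in terms})


terms = [
    Term("CL:0000127", "astrocyte"),
    Term("CL:0000001", "primary cultured cell"),
    Term("CL:0000003", "native cell"),
]
lookup = build_lookup(terms)

print(lookup.astrocyte.ontology_id)              # CL:0000127
print(lookup.primary_cultured_cell.ontology_id)  # CL:0000001
```

Names that normalize to the same attribute would overwrite each other in this sketch; a real implementation would need a policy for such collisions.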
The cell type ontology itself is accessible as a pronto Ontology object via .ontology:
ct.ontology
Ontology('/home/runner/work/bionty/bionty/.nox/build-package-bionty/lib/python3.9/site-packages/bionty/_dynamic/human___cl___2023-02-15___CellType', timeout=100)
len(ct.ontology.terms())
4168
term = ct.ontology["CL:0000128"]
term.definition
Definition('A class of large neuroglial (macroglial) cells in the central nervous system. Form the insulating myelin sheath of axons in the central nervous system.', xrefs={Xref('MESH:D009836'), Xref('http://en.wikipedia.org/wiki/Oligodendrocyte')})
term.is_leaf()
True
Tissue, disease, and phenotype work similarly.
Tissue#
tissue = bt.Tissue()
df = tissue.df()
df.head()
ontology_id | name | children |
---|---|---|
UBERON:0000000 | processual entity | [UBERON:0000104, UBERON:0000105, UBERON:0035943] |
UBERON:0000002 | uterine cervix | None |
UBERON:0000003 | naris | [UBERON:0005928, UBERON:0010425, UBERON:0005931] |
UBERON:0000004 | nose | None |
UBERON:0000005 | chemosensory organ | [UBERON:0003212] |
lookup = tissue.lookup()
lookup.kidney
tissue(ontology_id='UBERON:0002113', name='kidney', children=array(['UBERON:0004538', 'UBERON:0000081', 'UBERON:0004539',
'UBERON:0002120', 'UBERON:0000080', 'UBERON:0000082'], dtype=object))
Disease#
disease = bt.Disease()
df = disease.df()
df.head()
ontology_id | name | children |
---|---|---|
BFO:0000001 | entity | [BFO:0000003, BFO:0000002] |
BFO:0000002 | continuant | [BFO:0000020, BFO:0000031, BFO:0000004] |
BFO:0000003 | occurrent | [BFO:0000015, HsapDv:0000144, UBERON:0000066, ... |
BFO:0000004 | independent continuant | [UBERON:0001062, BFO:0000040, BFO:0000141, CAR... |
BFO:0000006 | spatial region | None |
lookup = disease.lookup()
lookup.chronic_kidney_disease
disease(ontology_id='MONDO:0005300', name='chronic kidney disease', children=array(['MONDO:0004375', 'MONDO:0001110', 'MONDO:0005016', 'MONDO:0024327',
'MONDO:0001184'], dtype=object))
Phenotype#
phenotype = bt.Phenotype()
df = phenotype.df()
df.head()
ontology_id | name | children |
---|---|---|
BFO:0000001 | entity | [BFO:0000003, BFO:0000002] |
BFO:0000002 | continuant | [BFO:0000020, BFO:0000004] |
BFO:0000003 | occurrent | [BFO:0000015, UBERON:0000066, UBERON:0000106, ... |
BFO:0000004 | independent continuant | [CL:0010007, UBERON:0000061, CARO:0030000, UBE... |
BFO:0000006 | spatial region | None |
lookup = phenotype.lookup()
lookup.cerebral_nerve_fasciculus
phenotype(ontology_id='UBERON:0022248', name='cerebral nerve fasciculus', children=None)
Readout#
Readout parses the Experimental Factor Ontology (EFO) into the following fields for describing biological experiments:
efo_id
name
molecule
instrument
measurement
These columns are reflected in the readout table in lnschema-wetlab.
readout = bt.Readout()
df = readout.df()
df.head()
 | ontology_id | name |
---|---|---|
0 | EFO:0000001 | experimental factor |
1 | EFO:0000002 | CS57511 |
2 | EFO:0000003 | CS57512 |
3 | EFO:0000004 | CS57515 |
4 | EFO:0000005 | CS57520 |
Search for a molecular readout:
readout.get("EFO:0010891")
{'ontology_id': 'EFO:0010891',
'name': 'scATAC-seq',
'molecule': 'DNA assay',
'instrument': 'assay by high throughput sequencer',
'measurement': None}
Search for a non-molecular readout:
readout.get("EFO:0004134")
{'ontology_id': 'EFO:0004134',
'name': 'tumor size',
'molecule': None,
'instrument': None,
'measurement': 'tumor size'}
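The two results differ in which fields are filled: the molecular readout has molecule and instrument set, while the non-molecular one only has measurement. A small sketch of telling them apart follows, using the two records copied from the outputs above. The helper readout_kind is hypothetical and not part of bionty, and the rule it applies is an assumption drawn from these two examples only.

```python
# Records copied from the readout.get(...) outputs above.
scatac = {
    "ontology_id": "EFO:0010891",
    "name": "scATAC-seq",
    "molecule": "DNA assay",
    "instrument": "assay by high throughput sequencer",
    "measurement": None,
}
tumor_size = {
    "ontology_id": "EFO:0004134",
    "name": "tumor size",
    "molecule": None,
    "instrument": None,
    "measurement": "tumor size",
}


def readout_kind(record):
    """Classify a readout record by which descriptive fields are filled (hypothetical helper)."""
    if record.get("molecule") is not None or record.get("instrument") is not None:
        return "molecular"
    if record.get("measurement") is not None:
        return "non-molecular"
    return "unclassified"


for record in (scatac, tumor_size):
    print(record["name"], "->", readout_kind(record))
```

Records with none of the three fields filled, such as generic EFO terms, fall into the "unclassified" bucket in this sketch.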