The Integrated Mitochondrial Protein Index (IMPI) is a curated collection of human genes encoding proteins with evidence of associating with mitochondria and affecting their form and function. The aim of IMPI is to define a mitochondrial proteome for studying mitochondrial (dys)function and disease, including when analysing DNA sequencing data from patients with mitochondrial disease.
The IMPI collection has two parts:
- A curated collection of genes encoding mitochondrial localised/associated/ancillary proteins with supporting PubMed citations (current release: 1,966 genes with over 2,300 citations)
- A collection of novel predicted mitochondrial proteins with highly significant evidence of mitochondrial localisation, but unreported in the literature as mitochondrial (current release: 119 genes)
The predicted portion is created and updated by applying machine learning to multiple types of evidence, including novel data types previously unused in defining the mitochondrial proteome, such as presence in manually curated metabolic models, antibody data from the Human Protein Atlas, conservation of predictions from four mitochondrial targeting sequence programs across four species, extensive experimental data from over 70 GFP and mass spectrometry localisation studies, protein-protein interactions, enrichment of mitochondrial RNAs binding CLUH, homologs in the amitochondriate Monocercomonoides sp., and literature/knowledge mining. Evidence is shared among orthologs in different organisms using the ortholog mapping of Ensembl Compara.
To determine if proteins are likely mitochondrial, a machine learning classifier (support vector machines) is used after training on collections of known mitochondrial and non-mitochondrial proteins. The machine learning algorithm then identifies proteins from the human genome having similar properties and evidence to previously characterised mitochondrial proteins.
All predictions are checked manually for literature supporting a mitochondrial localisation. If confirming publications are found, the entries are moved to the IMPI curated collection and the machine learning is re-run.
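The classification step described above can be illustrated with a small sketch. It is not the IMPI pipeline: the source states only that support vector machines are used, so the linear kernel, the subgradient training loop, the two evidence scores per protein, and all numbers below are assumptions made for illustration.

```python
# Hypothetical toy data: each protein is described by two made-up evidence
# scores in [0, 1]. Label +1 = known mitochondrial, -1 = known non-mitochondrial.
TRAINING = [
    ((0.9, 0.8), +1), ((0.1, 0.2), -1),
    ((0.8, 0.9), +1), ((0.2, 0.1), -1),
    ((1.0, 0.7), +1), ((0.0, 0.3), -1),
    ((0.7, 1.0), +1), ((0.3, 0.0), -1),
]


def decision_value(weights, bias, features):
    """Signed score: positive means the mitochondrial side of the hyperplane."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def train_linear_svm(examples, learning_rate=0.1, reg=0.01, epochs=200):
    """Fit a linear SVM by subgradient descent on the L2-regularised hinge loss."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in examples:
            margin = label * decision_value(weights, bias, features)
            # Shrink the weights (gradient of the L2 penalty).
            weights = [w * (1.0 - learning_rate * reg) for w in weights]
            # The hinge-loss subgradient is non-zero only inside the margin.
            if margin < 1.0:
                weights = [w + learning_rate * label * x
                           for w, x in zip(weights, features)]
                bias += learning_rate * label
    return weights, bias


def predict(weights, bias, features):
    """Return +1 (predicted mitochondrial) or -1 (predicted non-mitochondrial)."""
    return 1 if decision_value(weights, bias, features) >= 0 else -1


weights, bias = train_linear_svm(TRAINING)
# Classify an uncharacterised (hypothetical) protein with strong evidence scores.
print(predict(weights, bias, (0.95, 0.9)))
```

In this sketch, proteins predicted as +1 that are not already in the curated set would be the candidates passed on to the manual literature check.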
See also the companion dataset to IMPI of genes encoding proteins associated with mitochondrial disease and dysfunction: The Buddy Collection (see below).
Availability:
IMPI-2021-Q4pre: 2021 release of IMPI database including:
- 1,357 genes encoding verified mitochondrial proteins with "gold standard" evidence of mitochondrial localisation, e.g. visual confirmation by GFP-tagging;
- 127 encoding associated mitochondrial proteins with evidence of mitochondrial localisation, but lacking visual confirmation, e.g. APEX tagging;
- 482 encoding ancillary mitochondrial proteins with no evidence of mitochondrial localisation, but reported to affect mitochondrial function or morphology; and
- 117 encoding predicted mitochondrial proteins not reported as mitochondrial or affecting mitochondrial function, but with highly significant evidence for mitochondrial localisation by statistical analysis of experimental data.
IMPI-2020-Q3pre: 2020 release of IMPI database including:
- 1,330 genes encoding proteins reported to have a mitochondrial localisation;
- 328 reported to affect mitochondrial function, morphology and dynamics, but lacking conclusive localisation; and
- 511 not reported as mitochondrial or affecting mitochondrial function, but with strong evidence for mitochondrial localisation by statistical analysis of experimental data.
IMPI-2018-Q2: 2018 release of IMPI database including:
- 1,184 genes encoding proteins reported to have a mitochondrial localisation; and
- 442 predicted from experimental data, encoding proteins with strong evidence for mitochondrial localisation.
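The category counts in the release lists above can be tallied with a few lines of Python. This is only a bookkeeping sketch: the short category keys are labels chosen here, and the numbers are copied directly from the lists.

```python
# Gene counts per category, copied from the release lists above.
# The short category keys are labels chosen for this sketch.
RELEASES = {
    "IMPI-2021-Q4pre": {
        "verified": 1357,
        "associated": 127,
        "ancillary": 482,
        "predicted": 117,
    },
    "IMPI-2020-Q3pre": {
        "reported localisation": 1330,
        "reported to affect function": 328,
        "predicted": 511,
    },
    "IMPI-2018-Q2": {
        "reported localisation": 1184,
        "predicted": 442,
    },
}


def curated_total(release):
    """Sum every category of a release except the predicted one."""
    return sum(count for category, count in RELEASES[release].items()
               if category != "predicted")


def release_total(release):
    """Sum all categories of a release, including predicted genes."""
    return sum(RELEASES[release].values())


for name in RELEASES:
    print(name, curated_total(name), release_total(name))
```

For the 2021 release the three curated categories (verified, associated and ancillary) sum to 1,966, which matches the curated gene count quoted for the current release.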
Resources
IMPI can be accessed from MitoMiner in a user-friendly interface.
Alternatively you can download the whole data file here in Excel format:
The Buddy Collection
The Buddy Collection is a companion dataset to the Integrated Mitochondrial Protein Index (IMPI). It is a manual curation of about 2,000 human genes encoding mitochondrial proteins for evidence of their association with disease and mitochondrial dysfunction in patients, model organisms and cell lines. It also includes results from genome-wide screens investigating aspects of mitochondrial dysfunction.
A simple evaluation of each gene was made and its dysfunction was classified as causing:
- Mitochondrial disease. Human disorder reported as a "mitochondrial disease" (342 genes with 604 citations)
- Disorder with mitochondrial dysfunction. Human disorder with mitochondrial dysfunction reported in patient (133 genes with 334 citations)
- Disorder with likely mitochondrial dysfunction. Human disorder with mitochondrial dysfunction reported in model organism or cell line of disorder, but not patient (120 genes with 320 citations)
- Disorder with possible mitochondrial dysfunction. Human disorder with mitochondrial dysfunction reported in model organism or cell line, but not patient (76 genes with over 178 citations)
- Disorder. Human disorder with no reported mitochondrial dysfunction in patient, model organism, or cell line (176 genes)
- Susceptibility. Associated with susceptibility to a human disorder, e.g. obesity (25 genes)
- Cancer. Disorder is a cancer/tumour-type disease, e.g. paragangliomas (41 genes)
- A further 581 genes (with about 700 citations) have reports of being associated with mitochondrial dysfunction in model organisms and cell lines, but no association with a disorder, and are candidates for new mitochondrial disease genes.
N.B. No distinction is made between mitochondrial dysfunction as a cause or consequence of a genetic disease, i.e. primary versus secondary. For a fuller evaluation of selected mitochondrial disease genes, please visit Mitochondrial disorders of the Neuromuscular Disease Centre of Washington University.
Curation and annotation resources:
- PubMed
- Online Mendelian Inheritance in Man
- Mitochondrial disorders of Neuromuscular Disease Centre of Washington University
- Mitochondrial energy generation disorders: genes, mechanisms, and clues to pathology
- Mitochondrial disease in children
- Genetics of mitochondrial diseases: Identifying mutations to help diagnosis
- Mitochondrial Disease Genes Compendium: From Genes to Clinical Manifestations (ISBN-10: 0128200294)
- GenomicsEngland MitoPanel: Gene reported as mitochondrial disease on panel
- Genes essential in all cancer cell lines of the Achilles project on Cancer Dependency Map Portal (DepMap) (https://doi.org/10.6084/m9.figshare.14541774.v2)
- A high-throughput screen of real-time ATP levels in individual cells reveals mechanisms of energy failure
- A genome-wide CRISPR death screen identifies genes essential for oxidative phosphorylation
- Genetic screen for cell fitness in high or low oxygen highlights mitochondrial and lipid metabolism
- A mitochondrial RNAi screen defines cellular bioenergetic determinants and identifies an adenylate kinase as a key regulator of ATP levels
- Genome-wide CRISPR/Cas9 deletion screen defines mitochondrial gene essentiality and identifies routes for tumour cell viability in hypoxia
- A compendium of genetic modifiers of mitochondrial dysfunction reveals intra-organelle buffering
- A genome-wide shRNA screen for new OxPhos related genes
- Pathogenic variants in the NCBI ClinVar database
Availability:
Current version: Buddy-2021Q4