Comparative Study
Genome Res. 2000 Aug;10(8):1204-10.
doi: 10.1101/gr.10.8.1204.

Predicting protein function by genomic context: quantitative evaluation and qualitative inferences

M Huynen et al. Genome Res. 2000 Aug.

Abstract

Various new methods have been proposed to predict functional interactions between proteins based on the genomic context of their genes. The types of genomic context that they use are Type I: the fusion of genes; Type II: the conservation of gene-order or co-occurrence of genes in potential operons; and Type III: the co-occurrence of genes across genomes (phylogenetic profiles). Here we compare these types for their coverage, their correlations with various types of functional interaction, and their overlap with homology-based function assignment. We apply the methods to Mycoplasma genitalium, the standard benchmarking genome in computational and experimental genomics. Quantitatively, conservation of gene order is the technique with the highest coverage, applying to 37% of the genes. By combining gene order conservation with gene fusion (6%), the co-occurrence of genes in operons in absence of gene order conservation (8%), and the co-occurrence of genes across genomes (11%), significant context information can be obtained for 50% of the genes (the categories overlap). Qualitatively, we observe that the functional interactions between genes are stronger as the requirements for physical neighborhood on the genome are more stringent, while the fraction of potential false positives decreases. Moreover, only in cases in which gene order is conserved in a substantial fraction of the genomes, in this case six out of twenty-five, does a single type of functional interaction (physical interaction) clearly dominate (>80%). In other cases, complementary function information from homology searches, which is available for most of the genes with significant genomic context, is essential to predict the type of interaction. Using a combination of genomic context and homology searches, new functional features can be predicted for 10% of M. genitalium genes.
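The individual coverage figures in the abstract (37%, 6%, 8%, and 11%) add up to more than the combined 50% because the categories overlap. The following minimal Python sketch illustrates that arithmetic. The gene identifiers, the genome size, and the function name `coverage` are hypothetical and chosen only for illustration; they are not data or code from the study.

```python
def coverage(genes_by_type, genome_size):
    """Return per-type coverage and combined (union) coverage as fractions."""
    per_type = {
        name: len(genes) / genome_size
        for name, genes in genes_by_type.items()
    }
    # A gene covered by several context types is counted only once here.
    covered = set().union(*genes_by_type.values())
    return per_type, len(covered) / genome_size


# Toy data: hypothetical gene identifiers in a hypothetical 10-gene genome.
toy = {
    "gene_order": {"g1", "g2", "g3", "g4"},
    "fusion": {"g1"},
    "operon_only": {"g5"},
    "cooccurrence": {"g2", "g6"},
}

per_type, combined = coverage(toy, genome_size=10)
naive_sum = sum(per_type.values())

print(f"sum of per-type coverage: {naive_sum:.2f}")
print(f"combined coverage:        {combined:.2f}")
```

In the toy data, two genes are covered by more than one context type, so the combined coverage is lower than the sum of the per-type values.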

Figures

Figure 1
(a) Coverage of and overlap between various types of genomic context for M. genitalium genes. Type I is gene fusion. Type II is the conservation of local gene neighborhood, which is separated into type IIa (the conservation of gene order) and type IIb (the co-occurrence of genes within potential operons in the absence of gene order conservation). Type III is the co-occurrence of genes in genomes. (b) Overlap between genes for which significant genomic context is available and genes for which functional features can be predicted by homology searches. For the latter, only genes that are homologous to genes with known molecular functions were included, which were determined by manual inspection. The dark gray areas in the figure are genes for which new functional features can be predicted by genomic context. They can be homologous to proteins with a known molecular function, in which case the context can indicate in which process this function plays a role (see text for specific examples). A complete list of genes for which new functional features could be predicted by genomic context and, if available, homology to proteins with known function, is available from http://dove.embl-heidelberg.de/MG/Context.
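Type III context (co-occurrence of genes in genomes) can be pictured as comparing presence/absence profiles of genes across a set of genomes. The sketch below is only an illustration of that idea. It uses Jaccard similarity with an arbitrary threshold as a stand-in scoring rule, which is an assumption and not the significance criterion of the study; the gene names, genome names, and function names are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets of genome names."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cooccurring_pairs(profiles, threshold):
    """Return (gene1, gene2, score) for gene pairs whose profiles score >= threshold.

    profiles maps a gene name to the set of genomes in which it occurs.
    The Jaccard score and the threshold are illustrative assumptions.
    """
    pairs = []
    for g1, g2 in combinations(sorted(profiles), 2):
        score = jaccard(profiles[g1], profiles[g2])
        if score >= threshold:
            pairs.append((g1, g2, score))
    return pairs


# Toy data: hypothetical genes and genomes.
profiles = {
    "geneA": {"sp1", "sp2", "sp4"},
    "geneB": {"sp1", "sp2", "sp4"},
    "geneC": {"sp2", "sp3"},
}

pairs = cooccurring_pairs(profiles, threshold=0.9)
print(pairs)
```

Here only geneA and geneB share an identical profile, so they are the only pair reported at this threshold.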
Figure 2
The types of functional interactions between M. genitalium proteins for the different types of genomic context. The surface areas of the circles are proportional to the number of genes for which the techniques apply. Classification was done by manual inspection, allowing detection of all possible described functional interactions between proteins. Subsequently, the functional interactions were divided along the following hierarchical classification:
1: direct physical interaction between the proteins
2: indirect physical interaction (i.e., the proteins are part of the same protein complex, but there is no evidence that they interact directly with each other)
3: the proteins are part of a single metabolic pathway
4: the proteins are part of a non-metabolic pathway, either regulatory or otherwise
5: the proteins take part in the same process
6: pairs of proteins of which at least one is hypothetical
7: proteins with known functions between which no functional interactions are known
Class 5 was only considered if the functional interactions between the proteins did not fall in classes 1–4. Types of functional interactions can be counted in two ways: per gene and per interaction. In general, the number of interactions is smaller than the number of genes, as two interacting genes only represent a single interaction. However, a single gene can have multiple genomic associations: In those cases they were normalized per gene. The results in the figure are based on a per gene count. The frequencies of the different classes of functional interactions did not alter significantly upon counting each interaction.
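The per-gene count described in this caption can be sketched as follows. The caption does not spell out the normalization, so this sketch assumes that each gene contributes a total weight of 1, split evenly over its genomic associations. It also assumes that the hierarchy is resolved by taking the lowest-numbered class that applies. Both rules, the function names, and the toy data are hypothetical.

```python
from collections import defaultdict


def most_specific_class(candidate_classes):
    """Pick the lowest-numbered class that applies (assumed hierarchy rule)."""
    return min(candidate_classes)


def per_gene_class_frequencies(associations):
    """Return the class frequencies under a per-gene count.

    associations maps a gene to a list of class numbers (1-7), one per
    genomic association. Each gene contributes a total weight of 1, split
    evenly over its associations (an assumed normalization).
    """
    totals = defaultdict(float)
    genes = [g for g, classes in associations.items() if classes]
    for gene in genes:
        classes = associations[gene]
        for c in classes:
            totals[c] += 1.0 / len(classes)
    return {c: weight / len(genes) for c, weight in sorted(totals.items())}


# Toy data: hypothetical genes with hypothetical interaction classes.
associations = {
    "geneA": [1, 1],                        # two associations, both class 1
    "geneB": [1, 3],                        # one class 1, one class 3
    "geneC": [most_specific_class({5, 6})], # resolves to class 5
}

freqs = per_gene_class_frequencies(associations)
print(freqs)
```

With this normalization the frequencies sum to 1 regardless of how many associations each gene has, so a gene with many associations does not dominate the count.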
Figure 3
Genomic context predicts substrate specificity of proteins involved in a nucleoside salvage pathway in M. genitalium. A cluster of five genes in M. genitalium includes four genes of a nucleoside salvage pathway. The “standard” gene for the fifth reaction in the pathway, phosphoribomutase (deoB), is absent. The fifth gene in the operon is homologous to phosphomannomutases and phosphoglucomutases. M. genitalium does not contain any other candidate for a phosphoribomutase. The most likely candidate for the phosphoribomutase is thus MG053. The significance of the location of a homolog of MG053 in a run with deoD is supported by the location of a homolog of the M. genitalium gene MG053 beside deoD in Mycobacterium tuberculosis.
Figure 4
Domain organization of two proteins that are encoded by neighboring genes in B. subtilis (ymdA and ymdB) and B. burgdorferi (BB0504 and BB0505), and that are both present in M. genitalium (MG130 and MG246). The three domains that have been functionally characterized, KH, HD, and 5′NT, can all be related to ribonucleotide metabolism. KH binds (single-stranded) RNA; HD hydrolyzes phosphates from nucleotides; and 5′NT hydrolyzes NMP to nucleosides. A fourth, uncharacterized sequence domain (DUF) is present at the C-terminus of MG246 and its orthologs.
