PLoS Genet. 2009 Jan;5(1):e1000344.
doi: 10.1371/journal.pgen.1000344. Epub 2009 Jan 23.

Organised genome dynamics in the Escherichia coli species results in highly diverse adaptive paths

Marie Touchon et al. PLoS Genet. 2009 Jan.

Abstract

The Escherichia coli species represents one of the best-studied model organisms, but also encompasses a variety of commensal and pathogenic strains that diversify by high rates of genetic change. We uniformly (re-)annotated the genomes of 20 commensal and pathogenic E. coli strains and one strain of E. fergusonii (the closest E. coli related species), including seven that we sequenced to completion. Within the approximately 18,000 families of orthologous genes, we found approximately 2,000 common to all strains. Although recombination rates are much higher than mutation rates, we show, both theoretically and using phylogenetic inference, that this does not obscure the phylogenetic signal, which places the B2 phylogenetic group and one group D strain at the basal position. Based on this phylogeny, we inferred past evolutionary events of gain and loss of genes, identifying functional classes under opposite selection pressures. We found an important adaptive role for metabolism diversification within group B2 and Shigella strains, but identified few or no extraintestinal virulence-specific genes, which could render difficult the development of a vaccine against extraintestinal infections. Genome flux in E. coli is confined to a small number of conserved positions in the chromosome, which most often are not associated with integrases or tRNA genes. Core genes flanking some of these regions show higher rates of recombination, suggesting that a gene, once acquired by a strain, spreads within the species by homologous recombination at the flanking genes. Finally, the genome's long-scale structure of recombination indicates lower recombination rates, but not higher mutation rates, at the terminus of replication. The ensuing effect of background selection and biased gene conversion may thus explain why this region is A+T-rich and shows high sequence divergence but low sequence polymorphism. Overall, despite a very high gene flow, genes co-exist in an organised genome.

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Figure 1. Escherichia coli core and pan-genome evolution according to the number of sequenced genomes.
Number of genes in common (left) and total number of non-orthologous genes (right) for a given number of genomes analysed for the different strains of E. coli. The lower and upper edges of the boxes indicate the first quartile (25th percentile of the data) and third quartile (75th percentile), respectively, of 1000 random different input orders of the genomes. The central horizontal line indicates the sample median (50th percentile). The central vertical lines extend from each box as far as the data extend, to a distance of at most 1.5 interquartile ranges (i.e., the distance between the first and third quartile values). At 20 sequenced genomes, the core-genome had 1976 genes (11% of the pan-genome), whereas the pan-genome had (i) 17 838 total genes (black), (ii) 11 432 genes (red) with no strong relation of homology (<80% similarity in sequence), and (iii) 10 131 genes (blue) after removing insertion sequence-like elements (3834, 21% of all genes) and prophage-like elements (3873, 22% of all genes).
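The resampling behind this kind of curve can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the input format (a dict of strain name to set of gene-family ids), the function name `core_pan_curves`, and the toy data are all assumptions made for the example.

```python
import random
from statistics import median


def core_pan_curves(genomes, n_orders=1000, seed=0):
    """Median core- and pan-genome sizes as genomes are added in random order.

    genomes: dict mapping a strain name to the set of gene-family ids it carries
    (hypothetical input format).
    """
    rng = random.Random(seed)
    names = list(genomes)
    core_sizes = [[] for _ in names]  # core_sizes[k] holds sizes after k+1 genomes
    pan_sizes = [[] for _ in names]
    for _ in range(n_orders):
        order = names[:]
        rng.shuffle(order)
        core = set(genomes[order[0]])
        pan = set()
        for k, name in enumerate(order):
            core &= genomes[name]   # families shared by every genome so far
            pan |= genomes[name]    # families seen in at least one genome so far
            core_sizes[k].append(len(core))
            pan_sizes[k].append(len(pan))
    return [median(c) for c in core_sizes], [median(p) for p in pan_sizes]


# Toy data, not taken from the paper.
toy = {"A": {1, 2, 3, 4}, "B": {1, 2, 3, 5}, "C": {1, 2, 6}}
core_curve, pan_curve = core_pan_curves(toy, n_orders=200)
print(core_curve[-1], pan_curve[-1])  # sizes once all three genomes are included
```

Only the medians are returned here; the quartiles and whiskers described in the caption would come from the same per-position lists.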
Figure 2. Frequency of genes within the 20 analysed Escherichia coli genomes.
At one extreme of the x-axis are the genes present in a single genome which are regarded as strain specific genes (9054 genes: 51% of the pan-genome), while at the opposite end of the scale are situated the genes found in all 20 genomes, which represent the E. coli core-genome (1976 genes: 11% of the pan-genome). Coloured rectangles represent the proportion of insertion sequence (IS)-like elements (yellow), prophage-like elements (green), and genes of unknown/unclassified function (white). Black rectangles represent genes for which a function can be assigned. Strain-specific genes correspond to 2885 IS-like elements (32%), 2352 prophage-like elements (26%), and 3220 genes of unknown/unclassified function (35%).
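A frequency distribution of this kind can be computed by counting, for each gene family, how many genomes contain it. The sketch below assumes the same hypothetical input format (strain name to set of gene-family ids) and uses toy data rather than the paper's annotations.

```python
from collections import Counter


def frequency_spectrum(genomes):
    """Return a list where entry k-1 is the number of gene families found in exactly k genomes."""
    presence = Counter()
    for genes in genomes.values():
        presence.update(set(genes))       # each genome counts a family at most once
    spectrum = Counter(presence.values())
    return [spectrum.get(k, 0) for k in range(1, len(genomes) + 1)]


# Toy data, not taken from the paper.
toy_genomes = {"A": {1, 2, 3, 4}, "B": {1, 2, 3, 5}, "C": {1, 2, 6}}
spectrum = frequency_spectrum(toy_genomes)
strain_specific, core = spectrum[0], spectrum[-1]
```

The first entry corresponds to strain-specific genes and the last to the core genome, matching the two extremes of the x-axis.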
Figure 3. Impact of gene conversion rate on phylogenetic reconstruction.
Sets of 20 sequences of 25 kbp were simulated 100 times under different gene conversion rates with constant tract length (50 bp) and mutation rate. The topology of the “true” genealogy of the sample (as inferred from a single nucleotide on which no gene conversion was allowed) was compared, using the Robinson and Foulds distance, to the topology inferred from phylogenetic tree reconstruction using the simulated sequences. Error bars indicate one standard deviation, and horizontal bars represent one standard deviation from the no-gene-conversion model. A high rate of gene conversion is required to affect the topology of the reconstructed phylogeny. The observed average ratio of gene conversion to mutation (Cgc/θ) is indicated by an arrow.
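The Robinson and Foulds distance counts the bipartitions (splits of the leaf set induced by internal branches) that appear in one tree but not the other. A minimal sketch, assuming trees are written as nested tuples of leaf-name strings (a representation chosen for this example, not the one used by the authors' software):

```python
def splits(tree):
    """Return the non-trivial bipartitions of a tree given as nested tuples of leaf names."""
    def leaf_set(node):
        if isinstance(node, str):
            return frozenset([node])
        return frozenset().union(*(leaf_set(child) for child in node))

    everything = leaf_set(tree)
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(walk(child) for child in node))
        other = everything - clade
        # Keep only splits with at least two leaves on each side.
        if len(clade) > 1 and len(other) > 1:
            found.add(frozenset([clade, other]))
        return clade

    walk(tree)
    return found


def robinson_foulds(tree_a, tree_b):
    """Number of bipartitions present in exactly one of the two trees."""
    return len(splits(tree_a) ^ splits(tree_b))


# Toy trees on five leaves, not taken from the paper.
true_tree = ((("A", "B"), "C"), ("D", "E"))
inferred_tree = ((("A", "C"), "B"), ("D", "E"))
distance = robinson_foulds(true_tree, inferred_tree)
```

Storing each split as an unordered pair of leaf sets makes the comparison independent of where the tree is rooted.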
Figure 4. Maximum likelihood phylogenetic tree of the 20 Escherichia coli and Shigella strains as reconstructed from the sequences of the 1878 genes of the Escherichia core genome.
The earliest diverging species, E. fergusonii, was chosen to root the tree. The numbers at the nodes correspond, in black, to the bootstrap values (1000 bootstraps) and, in grey, to a “consensus strength”, which is the number of genes that confirms the bipartition (see Materials and Methods). The latter value is displayed only in instances where consensus and tested trees correspond. The branch length separating E. fergusonii from the E. coli strains is not to scale; the numbers above the branch indicate its length. Phylogenetic group membership of the strains is indicated with bars at the right of the figure.
Figure 5. Association between gene repertoire relatedness and phylogenetic distance.
The horizontal line corresponds to the average relatedness among Escherichia coli/Shigella strains. The log fit shows an R² = 0.26 (p < 0.01), which drops to R² = 0.07 (p < 0.01) if the points before the dashed line are removed.
Figure 6. Inferred gene content evolution in the lineage of Escherichia coli.
The cladogram shows the phylogenetic relationships among the 20 E. coli/Shigella genomes rooted on the E. fergusonii genome, as in Figure 4, but ignoring branch lengths. The major phylogenetic groups are indicated by the vertical lines. Each strain and internal node of the tree is labelled with the number of genes present (as inferred by maximum likelihood: see Materials and Methods). Coloured rectangles represent different gene classes within the gene repertoires of ancestral and modern E. coli. Rectangle widths are proportional to the number of genes. The four different gene classes (by colour) include genes that are: in the core genome (white), not clade-specific (grey), clade-specific but not ubiquitous in the clade (green) and both clade-specific and ubiquitous in the clade (yellow). A clade-specific gene is one that is inferred to be present only in the node and its descendent nodes.
Figure 7. Reconstruction of gains and losses of genes in the evolution of Escherichia coli.
The cladogram shows the phylogenetic relationships among the 20 E. coli/Shigella genomes rooted on the E. fergusonii genome, as in Figure 4, ignoring branch lengths for clarity. Each strain and internal node of the tree is labelled with the inferred numbers of genes gained (red: top) and lost (black: top) and the inferred numbers of corresponding events of gene acquisition (red: bottom) and loss (black: bottom) along the branch. Pie charts on each branch represent the functional classification of genes gained based on the colour-scale (details in the keys). The functional classes of known function genes are represented by numbers explained by a key in Supplementary Table 5. A similar figure, but displaying the pie charts for genes lost in the branch, is given in supplementary material (Figure S5).
Figure 8. Global view of insertion/deletion hot spots.
Number of genes (ranging from 0 to 200) in indels along the genomes of modern strains according to the ancestral gene order of the core genome. The numbers on the x-axis represent the order of genes in the core genome, which has the same order as E. coli K-12 MG1655.
Figure 9. The genomic island at the pheV tRNA insertion hot spot in the different Escherichia coli strains.
The figure provides a synthetic view of the pheV tRNA insertion hotspot in the different studied E. coli strains. This region has been defined using the synteny breaks among 12 E. coli strains. In E. coli K-12 MG1655, the genes immediately flanking the pheV tRNA gene are the ECK2960 gene (speC, ornithine decarboxylase) and the ECK2981 gene (pitB, phosphate transporter). In strain APEC O1, the pheV tRNA gene is absent. As most E. coli genomic regions have a composite structure, e.g., a region partially conserved or found in different synteny groups in other strains (i.e., at different genomic locations), we have manually divided this large genomic island into sub-regions (or modules), which are found in only a subset of the compared E. coli strains. Homologous modules have the same colour code and identifying number throughout. A total of 23 homologous modules were defined. The composition of these modules (i.e., the lists and functional descriptions of the constituent genes) is available in Supplementary Table 7. Black modules are strain-specific. Modules with hatched patterns correspond to repeated regions. Modules with grey dotted patterns are found in other strains but at another genomic location. The pathogenicity island published as PAI-V in UTI89 and 536 or PAI-I in APEC O1 and CFT073 ends just before module number 6.
Figure 10. Standardized cumulative sum of effective gene conversion rate and G+C content.
Gene conversion rate (i.e., probability of being involved in a gene conversion event, Cgc·Lgc) is shown in blue, and G+C content in red. A decrease in the cumulative sum reflects regions of lower-than-expected values of the statistics. Around the terminus domain, we found a decrease in both recombination and G+C content. Coloured boxes represent the 4 different organisation macrodomains (Right, Ter, Left, Ori). The arrows point towards the origin and terminus of replication.
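One common way to build such a curve is to convert each value along the chromosome to a z-score and then accumulate. The caption does not spell out the exact standardisation, so the sketch below is an assumption, and the function name and toy values are hypothetical.

```python
from itertools import accumulate
from statistics import mean, pstdev


def standardized_cusum(values):
    """Cumulative sum of z-scores (assumed standardisation; see lead-in)."""
    centre = mean(values)
    spread = pstdev(values)
    if spread == 0:
        return [0.0] * len(values)  # constant input: no deviation to accumulate
    return list(accumulate((v - centre) / spread for v in values))


# Toy values with a low stretch in the middle, not taken from the paper.
toy_values = [2, 2, 0, 0, 2, 2]
curve = standardized_cusum(toy_values)
lowest_point = curve.index(min(curve))
```

The curve falls wherever consecutive values lie below the genome-wide mean, so a sustained decrease marks a region of lower-than-expected values, and the curve returns to approximately zero at the end.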
